CONTEXTUAL  VARIABILITY  IN  THE  TRANSFER  OF  PROBLEM  SOLVING 

SKILLS 


By 


RYAN  WEST 


A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE
UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE
REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY  OF  FLORIDA 


2002 


TABLE OF CONTENTS

ABSTRACT
INTRODUCTION
THEORY
EXPERIMENT 1
    Method
    Results
    Discussion
EXPERIMENT 2
    Method
    Results
    Discussion
EXPERIMENT 3
    Method
    Results
    Transfer Results
    Transfer Discussion
    Matching Results
    Matching Discussion
DISCUSSION
REFERENCES
BIOGRAPHICAL SKETCH


Abstract  of  Dissertation  Presented  to  the  Graduate  School  of  the  University  of  Florida  in 
Partial  Fulfillment  of  the  Requirements  for  the  Degree  of  Doctor  of  Philosophy 

CONTEXTUAL  VARIABILITY  IN  THE  TRANSFER  OF  PROBLEM  SOLVING 

SKILLS 

By 

Ryan  West 
August,  2002 

Chair:  Robert  D.  Sorkin 
Major  Department:  Psychology 

This  study  explored  recognition  processes  involved  in  the  early  stages  of  transfer. 
The  purpose  of  this  study  was  to  describe  how  individuals  learn  from  examples  and 
retrieve  known  problems  to  help  solve  new  ones.  Ninety-four  participants  from  the 
University  of  Florida  and  surrounding  community  participated  in  3 studies  that  employed 
transfer  and  similarity-matching  tasks  to  investigate  the  relationship  between  contextual 
variability  in  practice  and  transfer.  A training  group  given  contextual  variability  in 
practice  was  more  likely  to  solve  transfer  problems  accurately  and  recognize  learned 
principles  embedded  in  novel  problems  compared  to  controls.  In  addition,  solvers  who 
did  not  receive  contextual  variability  in  training  but  solved  a simple  transfer  problem 
showed  improved  transfer  and  recognition  of  embedded  principles.  It  was  believed  that 
variation  across  surface  features  in  problems  deemphasized  the  utility  of  perceptual 
surface  features  and  emphasized  abstract  but  consistent  information  in  problems  used  to 
categorize problems. A learning mechanism rooted in activation-based learning was
described to suggest how problem retrieval and transfer performance could be related to
problems  over  which  the  skill  was  developed.  The  implications  of  these  studies  are 
important  for  current  theories  of  transfer  and  provide  insights  into  the  differences 
between  experts  and  novices  in  categorization,  representation,  and  problem  solving. 




CONTEXTUAL  VARIABILITY  IN  THE  TRANSFER  OF  PROBLEM  SOLVING 

SKILLS 

This  study  explored  recognition  processes  involved  in  the  early  stages  of  transfer. 
The  purpose  of  this  study  was  to  describe  how  individuals  learn  from  examples  and 
retrieve  known  problems  to  help  solve  new  ones.  A learning  mechanism  is  described  to 
suggest  how  episode  retrieval  and  transfer  performance  are  related  to  the  problems  over 
which  the  skill  was  developed.  This  mechanism  offers  an  account  of  the  shift  in 
categorization  and  transfer  strategies  observed  between  experts  and  novices  in  problem 
solving. 

One  of  the  most  remarkable  features  of  the  cognitive  system  is  its  ability  to  solve 
novel problems. With a finite base of knowledge, individuals are able to solve a seemingly
infinite  number  of  new  problems  with  ease.  The  cognitive  system  achieves  this  by 
transferring  knowledge  obtained  in  past  situations  to  aid  performance  in  new  situations. 

This  paper  begins  with  a definition  of  transfer  and  describes  the  importance  of 
similarity  between  problems  stored  in  memory  and  the  solver’s  representation  of  a new 
problem.  The  influence  of  internal  representations  in  problem  solving  is  discussed  with 
respect  to  expert/novice  differences  in  transfer  and  categorization  studies.  A comparison 
of  the  current  theories  of  transfer  follows.  Finally,  contextual  variability  is  introduced  as 
a factor  that  may  resolve  differences  between  transfer  theories  and  account  for  the  shift 
from  novice  to  expert  strategies. 




It  can  be  argued  that  transfer  and  learning  are  theoretically  identical  (Kimball  & 
Holyoak,  2000).  Transfer  can  be  defined  as  the  degree  to  which  previous  exposure  to  a 
base  task  affects  performance  on  a target  task  that  is  somehow  related.  The  two  tasks 
may be identical, in which case transfer is a measure of learning, or they may vary on a
number of dimensions to various degrees.

The  nature  of  the  transfer  may  be  positive  or  negative  depending  on  the  effect  the 
base  task  has  on  the  target  task.  If  exposure  to  the  base  task  facilitates  performance  on 
the  target  task,  then  transfer  is  positive.  If  exposure  to  the  base  task  hinders  performance 
on  the  target  task,  transfer  is  negative.  For  example,  if  one  is  given  the  task  to  classify 
animals  and  learns  to  categorize  animals  that  fly  as  birds,  this  results  in  positive  transfer 
when  confronted  with  sparrows  and  finches,  but  results  in  negative  transfer  when 
encountering  mosquitoes  or  penguins. 

Before  solving  a problem,  it  is  presumed  an  individual  forms  a mental 
representation  of  the  problem  (Newell  & Simon,  1972;  Larkin,  McDermott,  Simon,  & 
Simon,  1980;  Chi,  Feltovich,  & Glaser,  1981).  This  representation  includes  what  are 
believed  to  be  the  salient  elements  of  the  problem,  the  organization  of  the  problem, 
constraints  on  solutions,  and  relevant  knowledge  retrieved  from  memory  that  was  not 
explicitly  stated  in  the  problem. 

Studies  on  transfer  in  isomorphic  problem  solving  (Kotovsky,  Hayes,  & Simon, 
1985)  and  analogical  reasoning  (Novick,  1988;  Zamani  & Richard,  2000)  suggest  that  the 
magnitude and direction of transfer depend on the perceived similarity between the base
and target tasks. The greater the similarity between the solver’s representation of the
current task and the representation of some task in memory, the greater the influence of
transfer. The issue of perceived similarity is an important one and differs from objective
similarity.  While  the  mechanisms  involved  in  the  perception  of  similarity  are  debatable 
(Tversky,  1977),  they  appear  to  be  influenced  by  the  context  around  the  problem  and  the 
prior  knowledge  of  the  solver. 

The  characteristics  of  the  representations  formed  by  solvers  vary  with  expertise  in 
the  problem  domain.  For  example,  Chi  et  al.  (1981)  found  that  physics  experts  and 
novice  physics  students  categorize  physics  problems  by  different  criteria.  The  novices 
tended  to  group  problems  based  on  surface  features  contained  in  the  problems  such  as 
pulleys, planes, etc. Physics experts, however, tended to group problems by structural or
functional similarity, that is, by how they would be solved. Experts grouped problems
based on the principles they illustrated, such as Newton’s Second Law. The authors
concluded  that  experts  possess  well-organized  memory  structures  or  schemas  that  support 
their  advanced  knowledge  and  categorize  problems  based  on  the  problem  schema  they 
represent.  Adelson  (1981)  found  similar  results  with  expert  and  novice  computer 
programmers  in  their  organization  and  recall  of  lines  of  programming  code. 

Studies  such  as  these  suggest  that  as  one  develops  familiarity  with  a domain,  the 
knowledge  structures  that  one  associates  with  that  domain  evolve  in  their  organization 
and  complexity.  Novices  seem  to  associate  information  around  simple  concepts  or 
surface  level  features.  Experts  associate  information  in  relation  to  structures  that  are 
broad  and  elaborate  in  their  organization.  They  are  able  to  recognize  embedded 
principles.  These  two  studies  exemplify  the  differences  found  in  expert/novice 
categorization in other domains including x-ray diagnosis (Lesgold, Rubinson, Feltovich,
Glaser, Klopfer, & Wang, 1988) and mathematics (Schoenfeld & Herrmann, 1982;
Hardiman, Dufresne, & Mestre, 1989). However, exactly how experts develop schemas
is  not  well  understood. 

The representation a solver forms of a problem is also related to the solver’s difficulty
with the problem. Representation is critical for deriving a solution. Accurately
categorizing a problem as a certain type often suggests possible solutions and facilitates
problem solving (Newell & Simon, 1972; Blessing & Ross, 1996). In studies on
isomorphic  problem  solving,  the  representation  of  a problem  was  found  to  be  a deciding 
factor  in  the  difficulty  of  the  solution  (Kotovsky,  Hayes,  & Simon,  1985;  Zhang,  1997). 

The  representation  of  a problem  in  working  memory  may  cue  associated 
information  that  is  relevant  to  the  problem.  In  this  way,  the  representation  of  the  problem 
that  the  solver  works  with  includes  information  obtained  from  the  problem  and 
knowledge  retrieved  from  memory.  Returning  to  transfer,  it  seems  that  the  representation 
of  the  target  problem  may  provide  retrieval  cues  for  the  activation  and  access  of  similar 
problems  in  memory.  These  retrieval  cues  may  include  perceptual  features,  semantic 
associations,  problem  constraints,  or  goal  structures.  Retrieved  problems  may  aid 
problem  solving  by  activating  prototypical  schema,  associated  solution  procedures,  or 
general  strategies.  For  novices,  this  would  suggest  the  retrieval  of  problems  with  similar 
surface  features.  Experts,  however,  may  be  able  to  use  surface  features  and  embedded 
principles  in  the  problem  to  retrieve  analogous  problems  or  schemas. 

Novick  (1988)  tested  this  assumption  with  arithmetic  word  problems.  In  this 
study,  both  advanced  and  novice  math  students  studied  example  problems  and  their 
solutions.  They  were  then  given  test  problems.  An  examination  of  solution  procedures 
showed that novices often attempted procedures borrowed from problems with similar
surface level features. This resulted in negative transfer when problems shared similar
surface  features  but  required  different  solution  procedures.  However,  if  the  base  and 
target  problems  shared  similar  surface  features  and  solution  procedures,  transfer  was 
positive.  Advanced  students  who  compared  and  contrasted  solution  procedures  showed 
positive  transfer  with  minimal  influence  of  surface  features. 

Expertise,  however,  does  not  guarantee  transfer  of  skill.  In  fact,  it  has  often  been 
shown  that  increased  familiarity  within  a domain  leads  to  increased  specificity  of  skill 
(Sternberg,  1996).  Improved  performance,  in  some  cases,  comes  at  the  expense  of 
generalizability.  Wiley  (1998)  demonstrated  that  individuals  with  a high  degree  of 
domain  knowledge  were  less  able  to  solve  problems  that  involved  domain  related  items 
but  required  non-domain  solutions.  Apparently  once  these  experts’  problem  schemas 
were  cued  by  the  items  in  the  problem,  they  fixated  on  a limited  range  of  solutions  that 
were  incorrect.  Similar  studies  have  shown  expertise  to  act  as  a mental  set  in  other 
domains  such  as  troubleshooting  in  electronics  (Besnard  & Bastien-Toniazzo,  1999). 

It  could  be  the  case  that  the  critical  factor  for  successful  transfer  is  not  expertise 
within  a domain  but  something  else  that  is  often  coupled  with  expertise.  Studies  show 
that  variability  in  the  context  in  which  a skill  is  practiced  often  improves  transfer  of  that 
skill  to  novel  tasks.  For  example,  Catalano  and  Kleiner  (1984)  used  a coincidence-timing 
task  that  had  participants  press  a button  when  a moving  object  passed  over  a set  marker. 
Individuals who practiced the task with varying object speeds performed better than
individuals who practiced at a constant speed when both groups were later tested at a new
speed.  Contextual  variability  has  been  found  to  improve  transfer  in  other  motor  tasks 
(Kerr & Booth, 1978) and verbal tasks (Bransford, Franks, Morris, & Stein, 1979).
However, these studies did not directly manipulate the surface and abstract characteristics
of the base and transfer tasks as other studies of transfer have, nor did they focus on
problem solving skills.

Some  authors  suggest  that  it  is  the  contextual  variability  that  experts  encounter 
over the course of their training that is responsible for their improved transfer (Kimball &
Holyoak,  2000).  With  practice,  experts  develop  schemas  that  help  them  encode  and 
retrieve  large  amounts  of  information  very  efficiently.  Some  schemas  may  describe 
categories  of  problems  by  their  features  and  solutions  in  a way  that  facilitates  problem 
solving  under  routine  circumstances.  However,  there  is  the  potential  for  these  schemas  to 
become  restrictive  if  they  develop  over  a narrow  range  of  scenarios. 

In  the  Novick  (1988)  study,  advanced  math  students  showed  positive  transfer  to 
novel  math  problems.  However,  it  can  be  argued  that  all  the  problems  were  within  their 
domain  of  expertise.  This  is  typical  of  many  studies  comparing  expert  and  novice 
performance  where  confounding  factors  that  include  aptitude,  years  of  education,  or  age 
are  overlooked. 

No studies have specifically examined the effect of contextual
variability on transfer while taking into account the surface level and structural similarities of the
target problems. It is not clear to what degree variability in practice facilitates transfer to
other  problems.  Studies  of  contextual  variability  in  learning  have  found  improved 
transfer  but  have  not  manipulated  the  levels  of  similarity  between  the  base  domain  and 
target  domain.  Studies  of  analogical  transfer  and  problem  solving  have  traditionally 
ignored expert/novice differences. Novick (1988) is a rare exception.
However, in light of studies that show some skills to be domain specific, perhaps it is
more  appropriate  to  examine  contextual  variability  rather  than  expertise  in  transfer. 

If  contextual  variability  in  practice  yields  results  on  a transfer  task  that  are  similar 
to  the  results  from  studies  on  expertise  and  transfer,  then  perhaps  there  is  value  in  the 
claim  that  variability  in  practice  is  more  important  than  amount  of  practice  for  transfer. 
Furthermore,  contextual  variability  may  suggest  a process  by  which  experts  develop 
schemas  with  abstract  features. 

Another  unresolved  issue  concerns  the  mechanisms  through  which  contextual 
variability  could  facilitate  a shift  in  problem  solving  strategy.  Current  theories  of  transfer 
developed  primarily  from  attempts  to  describe  mechanisms  through  which  base  problems 
are  mapped  onto  target  problems  to  transfer  a solution.  These  theories  have  not  addressed 
the  changing  influence  of  surface  details  and  causal  information  across  levels  of  expertise 
or  variability  of  practice.  In  fact,  some  theories  of  transfer  focus  on  the  importance  of 
abstract  features  while  others  focus  on  surface  features.  Although  each  side  is  supported 
by  data,  there  has  been  no  explanation  for  a shift  from  reliance  on  surface  features  to 
abstract features. The lack of an explanation for how transfer strategies develop illustrates a
gap in the broader understanding of transfer (Kimball & Holyoak, 2000; Reeves &
Weisberg, 1994). In this study, a learning mechanism is proposed that describes how
repeated encounters with a basic principle over a range of problems shift attention from
surface features to subtle but consistent structural information in the problem.

The  consensus  among  various  theories  of  transfer  is  that  the  process  of  transfer 
requires  at  least  three  stages  (Reeves  & Weisberg,  1994;  Novick,  1988).  The  first  two 
stages reflect transfer’s inherent dependency on memory and the final stage entails the
mechanisms for mapping one problem onto another. The first stage involves the
encoding  of  the  base  and  target  problems.  The  second  stage  involves  the  retrieval  of  a 
base  problem  when  presented  with  the  target.  This  stage  is  sometimes  separated  into  the 
activation  of  multiple  problems  and  the  selection  of  a single  one.  In  the  third  stage,  the 
base  analog  is  applied  or  mapped  to  the  target  to  produce  a solution  or  contribute 
additional  information. 

Theories  describing  the  process  of  analogical  transfer  generally  fall  into  one  of 
three camps. The structure-mapping theory developed by Gentner and her colleagues
(Gentner, 1983; Gentner, 1989) views analogical transfer as a process of mapping the
structural  or  semantic  relationships  of  a base  problem  onto  a target  to  provide  missing  or 
useful  information  such  as  a solution.  In  this  theory,  the  critical  mechanisms  of  transfer 
extract  the  structural  relationships  describing  a problem  such  as  causal  relations, 
hierarchical  organization,  and  semantic  relations  and  match  them  against  other  structural 
representations  of  problems  stored  in  memory. 

Gentner proposes that initial matches between base and target analogs can occur
at  any  level  of  representation  from  surface  features  to  higher  order  relations.  Among  the 
potential  matches,  however,  the  selection  of  an  analog  for  mapping  depends  on  the 
highest  order  of  similarity  to  the  target.  Thus,  surface  features  may  be  useful  for  cueing 
potential  analogs  but  those  that  share  the  highest  level  of  abstract  similarity  dominate  the 
retrieval  and  mapping  stages. 

The pragmatic schema theory proposed by Holyoak and his colleagues (Gick &
Holyoak, 1983; Holyoak & Thagard, 1989a; Holyoak & Thagard, 1989b) also asserts that
the selection and application of base analogs depend on information abstracted from the
problem. However, the pragmatic schema theory claims that the mapping mechanisms of
transfer  depend  on  recognition  of  pragmatic  features  in  a problem  and  not  systematic 
relations.  These  pragmatic  features  include  the  goal  of  the  problem  and  factors  that 
constrain  possible  solutions. 

According  to  Holyoak,  the  retrieval  of  similar  analogs  relies  on  the  overlap  of 
surface  features,  solution  constraints,  or  goals  in  the  base  and  target  representations.  The 
selection  of  a single  schema  or  problem,  however,  is  based  on  the  degree  of  similarity, 
previous  usefulness,  and  the  relevance  of  the  base  problem  to  the  goals  of  the  target 
problem.  Similar  to  the  structure-mapping  theory,  surface  elements  may  be  useful  in 
cueing  potential  examples  but  retrieval  depends  on  higher  order  knowledge  about  the  two 
problems. 

In  contrast  to  both  the  structure-mapping  and  pragmatic  schema  theories  are 
exemplar-based  theories  exemplified  by  Ross’s  remindings  theory  (Ross,  1984;  Ross  & 
Kennedy,  1990).  Common  among  these  theories  is  a focus  on  lower  level  features  of 
problems  rather  than  abstract  features.  In  the  remindings  theory,  the  retrieval  of  an 
analog  begins  with  the  similarity  of  surface  level  details  in  the  target  and  base  problems. 
The  mapping  of  one  example  onto  another  occurs  by  a process  of  matching  corresponding 
elements  then  relations.  Within  this  theory,  it  is  possible  for  the  solver  to  learn  general 
categories  of  features  that  afford  the  recognition  of  similar  problems.  In  this  way, 
problem  schemas  are  developed  through  their  association  with  a broad  range  of  surface 
features. Thus, the surface level details of the base representation influence all stages of
transfer directly or indirectly.


The fundamental difference between the exemplar theories and both the structure-mapping
and pragmatic schema theories is the importance of surface features in the
transfer process. For the structure-mapping and pragmatic schema theories, surface
features  aid  only  in  the  retrieval  of  potential  base  analogs.  In  exemplar  theories,  surface 
features  influence  all  stages  of  transfer.  These  three  classes  of  theories  have  their 
strengths  and  weaknesses  in  explaining  data  collected  over  a range  of  analogical 
reasoning  studies.  Each  theory  successfully  explains  some  part  of  the  transfer  process  or 
transfer  under  certain  conditions.  Each  theory’s  explanation  is  distinctly  rooted  in  its 
assumption  about  the  knowledge  involved  in  transfer. 

Both the pragmatic schema and structure-mapping theories require knowledge that
represents high level information about the problem, which novices may not possess. The
pragmatic  schema  theory  requires  problem  schemas  that  organize  goal-relevant  and 
solution  constraining  information.  The  structure-mapping  theory  requires  knowledge  that 
represents  the  structural  relationships  among  the  elements  in  a problem.  In  contrast  to  a 
reliance  on  abstract  and  higher  order  information,  the  remindings  theory  is  based  on  the 
literal  content  of  problems  and  relies  on  memories  from  episodic  traces.  This  view, 
however,  does  not  consider  the  broader  role  of  schemas  that  characterizes  expertise. 

No  one  theory  encompasses  the  strengths  of  all  three  perspectives.  A 
comprehensive  theory  of  transfer  would  have  to  account  for  the  shifting  roles  that  surface 
level  and  structural  information  play  in  transfer  as  one’s  exposure  to  a range  of  problems 
increases.  The  evidence  shows  that  surface  level  information  plays  a significant  role  in 
transfer, especially for novices. However, as a flexible expertise in a field develops,
structural knowledge seems to dominate and afford new opportunities for categorization,
representation,  and  transfer. 

As  stated  earlier,  the  process  of  transfer  is  widely  considered  to  have  at  least  three 
stages:  encoding  and  representation,  retrieval  and  selection,  then  mapping.  The  first  two 
stages  heavily  depend  on  memory;  therefore,  a comprehensive  theory  of  transfer  must 
incorporate  changes  that  occur  in  the  structure  of  knowledge  and  the  effect  these  changes 
have  on  transfer  performance. 

The  purpose  of  this  paper  is  to  explore  the  relationship  between  contextual 
variability  and  transfer.  The  relationship  between  expertise  and  transfer  is  not  always 
clear  and  perhaps  this  component  of  expertise  is  a critical  factor.  If  so,  this  may  help  to 
develop  our  understanding  of  how  people  develop  skill  in  problem  solving. 

Furthermore,  it  is  not  known  exactly  how  contextual  variability  improves  transfer. 
In  this  study,  a learning  mechanism  is  described  that  accounts  for  differences  in  transfer 
related  to  various  kinds  of  practice.  Such  a learning  mechanism  could  be  used  to  expand 
current  theories  of  transfer  and  explain  how  solution  strategies  shift  from  a dependence 
on surface features to structural relations as one is exposed to broader variations of the
problem.


THEORY 


The  view  proposed  in  this  study  is  that  differences  in  transfer  among  groups  of 
individuals  may  be  accounted  for  by  a learning  mechanism  that  focuses  on  the  base-level 
activations  and  activation  strengths  of  elements  associated  in  memory.  This  theory  takes 
an  extreme  view  in  that  it  focuses  entirely  on  bottom-up  recognition  processes  in  the 
retrieval  of  example  problems  in  the  early  stages  of  transfer.  Before  describing  the  theory 
proposed  here,  it  is  necessary  to  establish  a framework  for  representing  knowledge  and 
altering  that  representation  to  reflect  learning. 

A key  feature  of  this  theory  is  activation-based  learning.  Activation-based 
learning  is  inherent  in  connectionist  networks  and  has  been  incorporated  by  symbolic 
cognitive  architectures  such  as  ACT-R  (Anderson  & Lebiere,  1998).  The  core  of  this 
belief  is  that  knowledge  can  be  represented  by  a large  set  of  smaller  interconnected  units. 
In  symbolic  cognitive  architectures,  these  units  represent  factual  information  or  small 
chunks  of  knowledge.  In  connectionist  networks,  they  represent  sub-symbolic  elements. 
Each  of  these  units  is  assigned  a certain  energy  level,  or  activation  level.  The  activation 
level  of  a unit  determines  the  availability  of  that  information  for  processing.  Elements 
with  high  activation  levels  are  readily  available  while  elements  with  low  activation  levels 
may  not  be  accessed  at  all. 

In  addition  to  base-level  activations,  the  connections  among  elements  have 
activation levels describing the strength of association between them. According to most
theories that incorporate spreading activation, the association strength indicates the
amount  of  activation  that  one  element  passes  on  to  the  next.  As  a result,  the  total 
activation  level  of  a single  element  is  the  base-level  activation  plus  the  sum  of  the 
activation  it  receives  from  its  connections. 
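
The additive rule described above can be illustrated with a minimal sketch. It assumes that the activation an element receives over a link is the association strength multiplied by the current activation of the sending element; that product, the function name `total_activation`, and all node names and numeric values are illustrative assumptions and are not taken from this study.

```python
def total_activation(node, base_level, strengths, source_activation):
    """Return base-level activation plus activation received over incoming links.

    strengths maps (source, target) pairs to association strengths.
    source_activation maps currently active elements to their activation.
    The strength-times-activation product is an assumption of this sketch.
    """
    incoming = sum(
        strength * source_activation.get(source, 0.0)
        for (source, target), strength in strengths.items()
        if target == node
    )
    return base_level[node] + incoming


# Hypothetical values: two perceptual elements linked to one principle.
base_level = {"pulley": 0.8, "rope": 0.6, "newton_second_law": 0.2}
strengths = {
    ("pulley", "newton_second_law"): 0.5,
    ("rope", "newton_second_law"): 0.25,
}
source_activation = {"pulley": 1.0, "rope": 1.0}

law_activation = total_activation(
    "newton_second_law", base_level, strengths, source_activation
)
print(law_activation)  # base level 0.2 plus 0.5 and 0.25 from its two links
```

An element with no incoming links simply keeps its base-level activation, which is why weakly connected structural information may remain difficult to access.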

The  adaptation  of  base-level  activations  and  activation  strengths  over  time  in 
neural  circuits  is  described  by  Hebbian  learning.  Hebbian  learning  was  inspired  by  the 
information  processing  characteristics  of  neurons.  Thus,  it  has  a strong  biological  and 
mathematical  base.  According  to  Hebb  (1949),  learning  occurs  through  biological 
changes  in  neural  circuits.  When  two  neurons  are  jointly  activated,  they  become  more 
closely  linked.  As  they  continue  to  be  jointly  activated,  the  efficiency  with  which  they 
transmit  impulses  increases.  In  this  way,  the  activation  levels  of  neural  patterns  change 
with  respect  to  the  environmental  conditions.  It  is  believed  this  is  the  process  of  learning 
at  the  neural  level. 
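
A common textbook formalization of this idea strengthens a connection in proportion to the product of the two units' activations. The sketch below uses that simple product rule as an assumption; the function name `hebbian_update` and the learning rate are hypothetical and are not parameters reported in this study.

```python
def hebbian_update(weight, pre, post, learning_rate=0.1):
    """Strengthen a link in proportion to the joint activation of its two units.

    The product rule and the learning rate are assumptions of this sketch.
    """
    return weight + learning_rate * pre * post


# Five joint activations of two fully active units strengthen the link each time.
w = 0.0
for _ in range(5):
    w = hebbian_update(w, pre=1.0, post=1.0)

# If only one unit is active, the link is left unchanged.
unchanged = hebbian_update(0.3, pre=1.0, post=0.0)
print(w, unchanged)
```

Under this rule a link grows only when both units are active together, which corresponds to the claim that jointly activated neurons become more closely linked.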

In  the  current  study,  the  exact  form  in  which  knowledge  is  represented  is  arbitrary 
and  could  be  expressed  in  a number  of  functionally  similar  ways.  Here  it  is  presumed 
that  knowledge  can  be  represented  in  a semantic  network  with  spreading  activation.  In  a 
semantic  network,  nodes  represent  meaningful  units  such  as  perceptual  objects  or 
concepts.  These  units  are  connected  by  links  that  describe  the  nature  of  the  relationship 
between  them. 

There  is  a slight  but  important  distinction  between  the  common  structure  of  a 
semantic  network  and  that  used  here.  It  is  important  to  consider  the  role  that  the  prior 
knowledge  of  the  solver  plays  in  solving  problems.  The  network  representation  of  the 
problem in the view proposed here includes the elements given in the problem and
elements activated from long term memory. With this framework established for
representing  knowledge  and  making  changes  to  that  representation  to  reflect  learning,  it  is 
possible  to  propose  a source  of  expert/novice  differences  and  the  effect  of  contextual 
variability  in  transfer. 

In  the  proposed  view,  each  problem  scenario  is  a unique  episode.  The  episode  is 
encoded  into  working  memory  as  a set  of  semantic  relations  that  describes  the  elements 
of  the  problem  and  the  relations  between  them.  For  the  sake  of  parsimony,  it  will  be 
assumed  that  the  working  memory  representation  is  successfully  stored  in  long-term 
memory  and  does  not  decay. 

If  the  problem  scenario  is  completely  unfamiliar,  all  elements  of  the  semantic 
representation  are  encoded  with  some  initial  base-level  of  activation.  Since  the  problem 
scenario is encoded as a unique episode, the association strengths between these elements
are relatively high for some duration, preserving the episode as a whole.

The  presumption  of  this  study  is  that  the  structural  information  that  defines  a 
problem  (causal  principles,  semantic  information,  goals,  etc.)  is  more  difficult  to  encode 
and  represent  than  perceptual  objects  in  a network.  Thus,  perceptual  features  may  be 
represented  with  higher  initial  activation  levels.  This  seems  justified  for  several  reasons. 
First,  perceptual  objects  may  be  easier  to  encode  because  they  are  directly  observable  in 
the  environment.  Structural  information  may  have  to  be  retrieved  or  inferred  by  a 
separate set of mental operations. Second, perceptual objects usually persist over time,
offering the potential for repeated encoding. Structural information may not be retrieved
or observed as often. Furthermore, structural information may be observable only as
changes over time and may require the storage and comparison of changing states of the
problem. 

As one encounters the same or a similar problem on repeated occasions, the surface
level features of the problem maintain higher activation levels than the structural
information. Thus, the accessibility of these features, expressed as base-level activation strengths,
increases at a faster rate than that of the structural information. The belief here is that the
strongest  retrieval  cues  in  a target  representation  for  accessing  base  analogs  are  the  ones 
that  have  high  activation  levels  and  correspond  to  elements  in  the  base  analog. 
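
The accumulation described above can be made concrete with a toy simulation. It is only a sketch under stated assumptions: each encounter adds a fixed increment to the base-level activation of every encoded element, the increment is larger for perceptual features than for the structural principle, and nothing decays. The gain values, the feature names, and the functions `practice` and `strongest_cue` are hypothetical. The varied condition reflects the proposal stated earlier that encountering one principle over a range of problems shifts attention toward consistent structural information.

```python
from collections import defaultdict

SURFACE_GAIN = 1.0     # assumed increment per encoding of a perceptual feature
STRUCTURAL_GAIN = 0.4  # assumed smaller increment for structural information


def practice(problems):
    """Accumulate base-level activation over a sequence of practice problems.

    Each problem is a (surface_features, principle) pair.
    """
    activation = defaultdict(float)
    for surface_features, principle in problems:
        for feature in surface_features:
            activation[feature] += SURFACE_GAIN
        activation[principle] += STRUCTURAL_GAIN
    return dict(activation)


def strongest_cue(activation):
    """Return the element with the highest accumulated activation."""
    return max(activation, key=activation.get)


# Constant practice: the same surface features accompany the principle each time.
constant = practice([(("pulley", "rope"), "second_law")] * 4)

# Varied practice: the principle recurs while the surface features change.
varied = practice([
    (("pulley", "rope"), "second_law"),
    (("inclined_plane", "block"), "second_law"),
    (("spring", "cart"), "second_law"),
    (("rocket", "fuel"), "second_law"),
])

print(strongest_cue(constant))  # a surface feature dominates
print(strongest_cue(varied))    # the recurring principle dominates
```

With constant practice the repeated surface features outpace the principle, whereas with varied practice no single surface feature accumulates as much activation as the principle that recurs across problems.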

Information  in  memory  with  high  base-level  activation  is  more  likely  to  be 
retrieved  and  become  part  of  the  problem  representation  in  working  memory.  According 
to  spreading  activation,  information  that  is  highly  associated  with  the  representation  is 
more  likely  to  be  retrieved.  One  can  presume  that  if  the  solver  categorizes  problems 
using  structural  information  as  a criterion,  then  the  structural  information  primes 
problems  or  solutions  in  memory.  Associated  information  that  is  activated  into  working 
memory  broadens  the  problem  space  beyond  information  explicitly  given  in  the  problem. 
One  expects,  then,  that  the  representation  of  the  problem  in  working  memory  contains 
information  directly  available  in  the  problem  and  associated  information  retrieved  from 
long  term  memory. 

This viewpoint offers an explanation for the novice reliance on the surface
features  of  problems  when  transferring  solutions.  As  novices  learn  a domain,  they  learn 
the  features  and  objects  of  the  domain  first  and  with  the  most  ease.  Over  a course  of 
time,  novices  gain  familiarity  with  the  features  and  underlying  relationships  in  a problem 
as the base-level activations of both increase in their internal representations. With
repeated exposure to the same problems, the surface features may persistently provide
sufficient  characterizations  of  the  problems  and  have  greater  salience  in  novice 
representation.  As  a result,  novices  attempt  to  transfer  the  solutions  from  problems  that 
have  similar  surface  features. 

For experts the explanation is more complicated. This theory predicts that if
expertise in a domain has developed over a limited set of problems where
the same surface features are encountered with the same abstract relations, then these
experts should exhibit transfer behavior like that of novices. Such experts may exhibit superior
recall,  recognition,  and  overall  familiarity  with  the  features  and  relations  of  a domain. 
However,  like  novices,  they  would  recognize  and  respond  to  surface  features  when 
attempting  to  transfer  solutions.  This  would  result  from  the  activation  levels  of  surface 
features  that  are  stronger  than  the  activation  levels  of  abstract  relations  in  their  long-term 
representation. 

For  novices  and  experts,  greater  familiarity  with  structural  relations  and  improved 
transfer  may  depend  on  the  range  of  episodes  they  encounter  in  the  domain.  It  is  known 
that  a high  degree  of  variation  in  the  training  one  receives  leads  to  improved  skill 
retention  and  transfer  (Mannes  & Kintsch,  1987;  Catalano  & Kleiner,  1984).  It  is 
possible  that  variation  in  training  improves  the  effectiveness  of  rules  or  schemata  that 
develop  with  practice.  The  variation  from  trial  to  trial  may  encourage  additional 
processing  concerning  the  relationships  among  the  variables  in  the  problem  (Schmidt  & 
Bjork,  1992). 

The  theory  proposed  here  explains  these  effects  in  terms  of  the  strengthening  of 
base-level activations. When a problem scenario is encountered that possesses novel
surface features but familiar structural features, the surface features are encoded as new
elements  with  initial  base-level  activations.  The  structural  features,  once  processed  and 
recognized,  are  strengthened  as  well.  In  this  way,  activation  levels  of  structural  features 
can  strengthen  beyond  the  surface  features  and  become  more  accessible  and  salient 
retrieval  cues  in  subsequent  problems. 

Next,  the  general  theory  outlined  above  will  be  described  in  enough  detail  that  it 
could  be  implemented  in  a simple  computational  model.  The  mechanism  is  not  complex 
enough  that  computer  simulations  are  required  to  determine  its  behavior.  Predictions  of 
the  mechanism  can  be  made  based  on  the  theory  outlined  above  alone.  However, 
providing  details  to  any  theory  at  the  level  necessary  for  implementation  in  a 
computational  model  enforces  a strong  empirical  rigor.  It  clearly  defines  processes  and 
declares  assumptions. 

The  specifications  for  this  learning  mechanism  provide  for  the  encoding  and 
representation  of  a problem  scenario,  the  recognition  of  familiar  elements,  activation  of 
associated  information,  and  the  adaptation  of  activation  levels  during  training.  Based  on 
the  strength  of  the  activation  levels  in  a semantically  associated  memory,  predictions 
could  be  made  regarding  which  features  of  a problem  would  be  recognized  with  the 
greatest  salience  in  the  problem  representation  and  which  base  analogs  would  be  selected 
for  mapping.  As  a result,  such  a model  makes  predictions  about  which  base  analog  stored 
in  memory  would  be  the  most  similar  to  any  given  target  problem. 

It  is  important  to  note  that  this  model  is  not  intended  to  provide  a complete 
account  of  the  transfer  process.  Rather,  it  offers  a mechanism  to  suggest  how  a shift  in 
problem representation and strategy takes place with practice in a domain. Other theories
of transfer have proposed processes that describe how problems are solved once an
analog  has  been  retrieved  but  none  sufficiently  account  for  the  differences  between 
experts  and  novices  or  the  shift  that  takes  place  between  them.  Thus,  this  mechanism  is 
intended  to  enhance  our  understanding  of  existing  theories  of  transfer  and  to  bridge 
structural  relational  theories  with  surface  feature  theories. 

This  mechanism  is  reminiscent  of  other  models  of  perception  and  memory.  One 
such  model  is  the  Elementary  Perceiver  and  Memorizer  (EPAM)  developed  originally  by 
Feigenbaum (Feigenbaum & Simon, 1984) and extended by others (Richman, Staszewski,
& Simon, 1995). The key difference between the theory described here and EPAM is in
the  learning  process.  EPAM  was  designed  to  learn  lists  of  items  and  to  discriminate 
patterns.  It  achieves  these  by  making  changes  to  a discrimination  net  it  uses  for  object 
recognition  and  a long-term  store  of  objects  and  properties.  EPAM’s  learning  process 
adds  objects  and  properties  to  its  long-term  store  to  increase  the  complexity  of  its 
knowledge  base.  The  learning  process  also  adds  new  feature  tests  to  the  discrimination 
net  to  improve  object  recognition. 

The  learning  process  described  by  the  current  mechanism  adds  items  to  memory 
during encoding like EPAM; however, it also makes changes to the activation levels of
elements  in  memory  to  reflect  learning.  Here,  the  activation  level  of  elements  and 
patterns  governs  recognition  and  not  a series  of  discrimination  tests  as  in  EPAM.  The 
learning  mechanism  explored  in  this  model  focuses  more  on  the  structure  of  elements  in 
memory  rather  than  the  number  of  them. 

As  stated  earlier,  the  similarity  that  a problem  solver  perceives  between  a target 
problem and a base analog is a key factor in the likelihood that the solver will attempt a
transfer solution. In the current theory, it is presumed that the overlap of highly salient
elements  in  the  target  and  base  representations  is  the  basis  for  perceived  similarity. 
Within  the  proposed  mechanism,  the  activation  level  of  elements  in  semantic  memory 
combined  with  the  representation  of  the  target  problem  should  predict  what  base 
problems  will  be  perceived  as  similar  and  retrieved  for  transfer. 

As  with  other  computer-based  semantic  network  systems,  scenarios  are  encoded 
by  breaking  them  down  into  nodes  representing  objects  and  relations  that  describe  the 
problem  as  a network  diagram.  Each  node  is  then  entered  into  a computational  program 
along  with  a list  of  nodes  that  connect  to  it.  In  this  way,  the  entire  problem  episode  can 
be  represented  as  a list  of  nodes  and  their  connecting  nodes. 
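
For illustration, such an encoding might be sketched in Python as a list of nodes paired with their connecting nodes. The scenario, the node labels, and the helper function below are hypothetical and are chosen only to show the format; they are not taken from any existing program.

```python
# Hypothetical episode: an apple resting on a scale. All labels are illustrative.
episode = [
    ("apple.1", ["rests_on.1", "fruit"]),     # object node
    ("scale.1", ["rests_on.1"]),              # object node
    ("rests_on.1", ["apple.1", "scale.1"]),   # relation node linking two objects
    ("fruit", ["apple.1"]),                   # semantic category
]


def connections(episode, node):
    """Return the nodes listed as directly connected to `node`."""
    for name, linked in episode:
        if name == node:
            return linked
    return []


print(connections(episode, "apple.1"))
```

Storing each node with its own list of connections keeps the whole episode in a single flat structure, which is all the mechanism requires when activation is assumed to spread over one link only.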

As  a rule,  nodes  that  represent  objects  have  higher  initial  base-level  activations 
than  nodes  that  represent  higher  order  relations.  This  reflects  the  assumption  in  the 
theory  that  perceptual  objects  are  easier  to  encode  and  represent.  By  default,  one  could 
assign a base-level activation of 2 to perceptual objects and 1 to relations.

While each node is unique, it could be connected to another by an abstract
relationship such as semantic association. For example, “apple.1” refers to a specific
object, but would be recognized as an association of “fruit” in general. As subsequent
problem scenarios are entered as a set of nodes, the mechanism would compare each
entered node to the existing set of nodes in memory. If it found a match, it would
increase the base-level activation of all occurrences of that node and, to a lesser degree,
the associated nodes. For example, if a new problem is entered that adds “apple.2”, all
instances of “apple.x” gain higher base-level activations. If, however, a new problem is
entered that adds “banana.1”, the base-level activation of “apple.x” is not strengthened
but the base-level activation of “fruit” is. To avoid an infinite loop of spreading
activation, the mechanism could make a simplifying assumption that activation spreads
over only one link. In this way, the mechanism would match patterns and associations
and change activation strengths.

Another  assumption  implemented  in  the  mechanism  concerns  the  strengthening  of 
activation  levels.  According  to  the  theory,  the  perceptual  objects  in  a problem  are 
strengthened  faster  than  abstract  associations.  When  a particular  class  of  object  is 
encountered  on  subsequent  training  examples,  the  activation  levels  of  all  instances  of  that 
object  class  could  be  increased  by  2.  When  the  same  happens  for  an  abstract  relation,  the 
increase could be by 1. Using the previous example, every occurrence of “apple” in a
problem  strengthens  the  base-level  activation  of  “apple.x”  by  2.  The  activation  passed  to 
“fruit”  is  1. 

Over a set of 5 problems that were solved by the same principle and
each involved apples, the base-level activation of “apple.x” would be 10. The base-level
activation of the associated semantic category “fruit” would be 5. When a new problem
is encoded as a nodal network, the higher base-level activation that “apple.x” has over
“fruit” dictates that “apple.x” will be checked for a match in the new problem before
“fruit”. If a match is found to “apple.x”, the principle that was associated with these
problems is retrieved as a potential solution to map into the new problem.

Over a set of 5 problems that were solved by the same principle and involved
different kinds of fruit, the base-level activation of “apple.x”, “banana.x”, etc. would be
2. The base-level activation of the semantic category “fruit” would still be 5. From this,
a prediction is made that the semantic category of “fruit” would be matched against a new
problem first and would cue the retrieval of the associated principle if a match was found.
Rather than an association to “fruit”, the problems could share the property of being
instances of Newton’s Second Law.
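
The bookkeeping in these two training scenarios can be sketched in a few lines of Python. This is a minimal illustration under the default values suggested above (objects begin at 2 and gain 2 on each later encounter; associated categories begin at 1 and gain 1), with activation spreading over a single link. The class name, function names, and the fruit names other than apple and banana are hypothetical and do not describe an existing implementation.

```python
# Default values taken from the text; every name below is an illustrative assumption.
OBJECT_INITIAL, OBJECT_STEP = 2, 2        # perceptual objects
RELATION_INITIAL, RELATION_STEP = 1, 1    # abstract relations and categories


class ActivationMemory:
    """Tracks one base-level activation per object class and per category."""

    def __init__(self):
        self.activation = {}

    def _strengthen(self, node, initial, step):
        # The first encounter encodes the node; later encounters strengthen it.
        if node in self.activation:
            self.activation[node] += step
        else:
            self.activation[node] = initial

    def encode(self, object_class, category):
        """Encode one problem containing an object and its associated category."""
        self._strengthen(object_class, OBJECT_INITIAL, OBJECT_STEP)
        # Activation spreads over exactly one link, to the associated category.
        self._strengthen(category, RELATION_INITIAL, RELATION_STEP)

    def retrieval_order(self):
        """Nodes sorted from the strongest to the weakest retrieval cue."""
        return sorted(self.activation, key=self.activation.get, reverse=True)


def train(objects, category="fruit"):
    """Encode a sequence of problems, one object class per problem."""
    memory = ActivationMemory()
    for obj in objects:
        memory.encode(obj, category)
    return memory


# Five problems that all involve apples.
consistent = train(["apple"] * 5)
# Five problems that each involve a different (hypothetical) fruit.
variable = train(["apple", "banana", "cherry", "grape", "pear"])

print(consistent.activation)            # apple: 10, fruit: 5
print(consistent.retrieval_order()[0])  # apple
print(variable.activation["fruit"])     # 5
print(variable.retrieval_order()[0])    # fruit
```

Under these assumptions the sketch reproduces the values given in the text: with consistent surface features the object class is the strongest cue, whereas with varied surface features the shared category is checked first.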

This  very  simple  implementation  of  the  theory  could  represent  simple  situations  in 
a semantic  network  and  adjust  base-level  activations  according  to  the  situations  that  were 
entered  into  it.  At  any  given  time,  the  memory  in  the  model  could  be  checked  to  see  what 
elements  and  patterns  had  the  strongest  base-level  activations.  This  captures  the  core 
concept  of  the  theory. 

These  simulated  base-level  activations  predict  what  base  examples  will  be  used  to 
provide missing information or a solution to a novel problem. As stated earlier, the intent
of  this  model  is  not  to  provide  a complete  account  of  the  transfer  process  but  to  offer  an 
explanation  for  what  base  examples  are  retrieved  at  the  initial  stages  of  transfer  and  how 
practice  influences  this  selection. 

The  following  set  of  experiments  examines  the  relationship  between  contextual 
variability and transfer. The first experiment attempts to validate the role of practice in
different contexts on transfer.


EXPERIMENT  1 


The  goal  of  the  first  experiment  was  to  see  if  contextual  variability  in  practice 
improved the transfer of problem solving skills. Also of interest was whether the results
observed  in  some  studies  of  transfer  and  expertise  could  be  replicated  in  a study  of 
transfer  and  contextual  variability  in  practice.  It  is  important  to  isolate  the  critical  factors 
that  differentiate  experts  from  novices  in  transfer  to  understand  how  the  transfer  of  skill 
develops. 

There  were  several  predictions  for  the  transfer  task  summarized  in  Table  2.  First, 
individuals  given  training  over  a set  of  problems  in  different  contexts  should  attempt 
transfer  strategies  based  on  the  principles  in  the  problem.  This  would  be  indicated  by 
positive  transfer  when  the  target  problem  contained  the  same  embedded  principle  as  a 
learned  base  problem.  Individuals  given  the  same  amount  of  training  but  with  a limited 
set  of  problems  should  attempt  strategies  related  to  the  surface  features  of  problems.  This 
would be indicated by positive transfer when the target problem shares similar surface
features and embedded principles. However, if the target problem contained similar surface
features  as  a known  problem  but  required  a different  principle,  negative  transfer  should 
occur.  It  is  expected  that  the  group  given  variable  practice  should  exhibit  better  overall 
transfer  indicated  by  their  accuracy  in  solving  transfer  problems. 




Method 

Participants 

The  participants  in  this  experiment  were  solicited  from  the  student  population  by 
advertising  in  the  student  paper  and  flyers  posted  around  the  campus.  Sixty-four  students 
from  the  University  of  Florida  were  recruited  and  paid  $10  for  their  participation.  For 
motivation,  there  was  a $30  bonus  in  each  experimental  group  for  the  participant  who 
solved the most problems correctly. Participants ranged in age from 18 to 51
with  a modal  age  of  21.  Of  these  64  participants,  37  were  female  and  27  were  male.  The 
distribution  of  males  and  females  between  the  experimental  groups  was  approximately 
even. 

Apparatus  and  Stimuli 

The  experiments  were  conducted  in  a quiet  computer  lab  on  PC  compatible 
computers with 15” monitors. A mouse and standard keyboard were used for input. A
computer  program  was  developed  to  administer  the  experiment  and  collect  data. 

The  training  and  testing  phase  of  all  experiments  in  this  study  involved  two 
physical  principles.  These  two  principles  were  originally  borrowed  from  the  mechanics 
of  respiratory  systems.  Because  these  principles  obey  basic  laws  of  physics,  they  can  be 
appropriately  applied  to  a range  of  analogous  contexts  such  as  electrical  circuits, 
hydraulic  systems,  and  other  pneumatic  systems.  The  first  principle  related  the  flow  of 
air  through  a tube  to  the  difference  in  pressure  across  the  tube  divided  by  the  resistance  to 
the flow. This can be stated:


Y = (X1 - X2) / Z  (1)

The  second  principle  related  the  flow  of  air  into  or  out  of  a compliant  structure 
such  as  a lung  to  the  change  in  pressure  within  the  structure  multiplied  by  the  compliance 
of  the  structure.  This  can  be  stated: 

Y = C (dP/dt)  (2) 
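
As a worked illustration of Equations 1 and 2, the two principles can be written as short Python functions. The function names and the numeric values are hypothetical and were not taken from the experimental materials. The derivative dP/dt is approximated here by a finite change in pressure over an interval, which is an assumption made for simplicity.

```python
def difference_rule(x1, x2, z):
    """Equation 1: flow equals the pressure difference divided by the resistance."""
    return (x1 - x2) / z


def compliance_rule(c, delta_p, delta_t):
    """Equation 2: flow equals compliance times the rate of pressure change."""
    return c * (delta_p / delta_t)


# Hypothetical values chosen only for illustration.
print(difference_rule(x1=12.0, x2=6.0, z=2.0))            # 3.0
print(compliance_rule(c=5.0, delta_p=4.0, delta_t=2.0))   # 10.0
```

Because both functions refer only to abstract quantities, the same code applies whether the variables are read as pressures and airway resistance or as voltages and electrical resistance.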

These  two  principles  were  chosen  because  of  their  simplicity  and  widespread 
application.  Formal  principles  were  chosen  over  procedural  principles  such  as  the 
operation  of  a water  seal  as  a one-way  valve  because  the  formal  principles  have  perfect 
analogs  in  different  domains  whereas  the  procedural  principles  may  only  have  loosely 
related  analogs. 

The  contexts  that  the  participants  learned  and  practiced  these  principles  in 
included  a respiratory  system  and  electrical  circuit.  Examples  of  the  contexts  used  in  the 
training and testing phases are shown in Figure 1. These contexts were chosen and
developed in an attempt to maximize the difference between the surface features in each
context.  The  attempt  was  made  to  make  each  context  visually  unique.  This  included  the 
terminology  used  within  each  context  to  refer  to  variables  and  features. 

Procedure 

Training 

Before  participating  in  the  study,  potential  subjects  were  screened  for  prior 
familiarity  with  the  principles  and  basic  algebraic  ability.  The  goal  of  this  screening  was 
to  remove  participants  who  already  knew  the  principles  or  did  not  have  the  mathematical 
skills  to  complete  the  problems.  Prior  familiarity  was  screened  by  giving  each  potential 
subject example questions similar to those in the study. They were asked if they had any
prior experience with solving problems similar to these. If any candidate responded that
they  had  previous  experience,  they  were  dropped  from  the  study. 

After  the  screening  for  familiarity,  potential  subjects  were  given  10  algebra 
problems  to  solve  involving  operations  similar  to  the  problems  in  the  study.  If  a 
candidate  was  unable  to  solve  more  than  70%  of  the  algebra  problems,  they  were  dropped 
from  the  study. 

In  the  training  phase,  all  participants  were  shown  the  two  principles  and  practiced 
solving  30  problems  with  the  principles,  15  problems  for  each  principle.  The  difference 
between  the  groups  during  training  was  the  manner  in  which  the  groups  practiced  using 
the  principles.  A summary  of  the  training  groups  is  shown  in  Table  1.  The  Consistent 
Practice  (CP)  group  received  practice  where  each  principle  was  in  a unique  context.  The 
purpose of this repeated pairing of the principle and the context was to foster the
association  of  the  principle  and  the  context. 

One  methodological  issue  was  addressed  concerning  the  CP  group.  Since  the  CP 
group  received  practice  with  one  principle  in  a specific  context,  it  could  happen  by  the 
nature  of  the  principles  and  contexts  chosen  that  a particular  pairing  was  easier  to  learn 
than  some  other  pairing.  To  reduce  this  potential  confound  and  increase  the  external 
validity  of  the  condition,  the  CP  group  was  split  into  two  groups  (CPa,  CPb)  and 
counterbalanced.  Here,  CPa  practiced  Principle  1 (difference  rule)  in  Context  A 
(electrical  circuit)  and  Principle  2 (compliance  rule)  in  Context  B (respiratory  system). 
CPb  practiced  Principle  1 in  Context  B and  Principle  2 in  Context  A.  The  groups 
retained  the  same  theoretical  properties;  however,  their  training  led  to  unique  predictions 
in  the  experimental  conditions. 




The  Variable  Practice  (VP)  group  received  practice  where  each  principle  was 
applied in a variety of contexts. Principle 1 was practiced in Contexts A and B.
Principle 2 was also practiced in Contexts A and B. The purpose of varying the context
with  which  each  principle  was  paired  was  to  decrease  the  association  of  the  principle  to  a 
specific  context. 

The  procedure  for  the  training  of  each  principle  was  as  follows.  First,  each 
principle  was  shown  situated  in  an  example  problem.  For  example,  Principle  1 was 
demonstrated  as  it  relates  to  solving  for  the  current  across  a voltage  differential  over  a 
resistor  in  an  electrical  circuit.  Then,  Principle  2 was  shown  as  it  relates  to  solving  for 
the  flow  of  air  through  a lung  given  the  lung’s  compliance  and  change  in  pressure.  After 
the  principles  were  introduced  and  demonstrated  by  an  example  problem,  the  participants 
solved  a random  set  of  practice  problems  involving  these  principles  and  algebraic 
variations  of  them. 

To  ensure  that  all  participants  received  the  same  amount  of  practice,  each 
participant solved 15 practice problems for each principle. There were two randomly
ordered  versions  of  the  training  and  testing  problems  made  for  the  VP  group  (VPa  and 
VPb).  Participants  in  the  VP  group  were  trained  and  tested  with  one  of  the  two  versions. 
For  each  CP  group,  there  was  only  one  randomly  ordered  version  of  presentation  for  the 
training  and  testing  problems.  Feedback  was  given  after  each  answer  to  indicate  the 
accuracy  of  their  response.  To  ensure  that  all  participants  received  the  same  amount  of 
corrective feedback during practice, feedback always included the principle used and the
correct solution to the problem.


Participants  were  expected  to  learn  the  principles  through  practice  and  corrective 
feedback.  It  was  not  likely  they  could  learn  the  principles  from  a single  demonstration 
example.  These  example  problems  were  meant  to  introduce  the  principle  only.  Once  the 
participant  solved  all  30  problems  in  the  practice  set,  they  proceeded  to  the  transfer  task. 
If  a participant  failed  to  achieve  70%  accuracy  over  the  training,  they  were  dropped  from 
the  study  for  failing  to  learn  the  rules. 

Testing 

After  participants  completed  the  required  number  of  practice  problems,  they 
moved  on  to  the  transfer  tasks.  The  questions  in  this  task  were  derived  from  a factorial 
combination  of  two  levels  of  surface  feature  (context)  with  two  levels  of  structural 
relation  (principle).  The  two  levels  for  each  variable  were  same  and  different.  This 
resulted  in  four  conditions  in  the  transfer  task.  Table  2 shows  the  four  conditions  and  the 
properties  of  the  transfer  questions  given  to  each  group  as  well  as  the  predictions  for 
group  accuracy  for  each  condition. 

Each  training  group  received  the  four  transfer  problems  in  a randomized  set. 
Response times were measured from the presentation of the problem to the first keystroke
they  entered  for  their  solution.  Participants  were  informed  that  response  times  would  be 
measured  but  instructed  to  focus  on  accuracy. 

The  purpose  of  the  transfer  task  was  to  manipulate  the  surface  level  similarity  and 
abstract  similarity  of  problems  in  a controlled  way  and  evaluate  their  particular  effects  on 
solution  accuracy  for  each  group.  In  this  task,  participants  were  presented  with  a graphic 
illustration  of  a problem  and  asked  to  type  in  the  value  of  the  missing  variable.  They 
were also asked to provide a brief description of what they thought as they attempted to
solve the problem. Participants gave these retrospective protocols by typing them into a
field  in  the  computer  program  that  administered  the  experiment.  The  protocol  data  was 
not  intended  to  provide  formal  data,  but  rather,  possible  insight  into  the  strategies  used 
by  participants  to  solve  the  problems. 

The  first  condition  (same  features  / same  principle)  was  referred  to  as  the  Learned 
problem  (Figure  2).  In  the  Learned  problem,  participants  were  presented  with  a transfer 
problem  that  involved  a principle  and  context  that  they  had  seen  in  association  before. 
Participants  received  a problem  of  this  type  during  training.  This  condition  can  be 
thought  of  as  a test  of  basic  learning  and  all  participants  were  expected  to  provide  the 
correct  solution.  In  this  condition,  the  VP  and  CPa  groups  received  the  same  question 
while the CPb group received a question with the same factorial properties but
tailored  to  their  training  (Figure  3).  CPb  could  not  receive  the  same  question  as  CPa 
because  they  did  not  practice  the  same  combination  of  principles  and  contexts. 

The  second  condition  (same  features  / different  principle)  was  referred  to  as  the 
Combination  problem  (Figure  4).  This  problem  presented  all  participants  with  a problem 
that  had  a familiar  context  but  whose  solution  required  a principle  they  did  not  learn 
during  practice.  There  were  several  specific  predictions  for  group  performance  in  this 
condition.  CPa  and  CPb  practiced  problems  in  this  context  but  used  different  principles. 
The  prediction  for  their  performance  was  that  they  would  provide  an  incorrect  solution  to 
the  problem  and  make  a predictable  mistake  based  on  the  principle  they  practiced  in  the 
context.  Using  the  example  in  Figure  4,  it  was  predicted  that  CPa  would  provide  an 
answer  based  on  the  principle  they  practiced  with  the  respiratory  system.  CPb  should 
provide  a different  answer  based  on  the  principle  they  practiced  with  this  system. 




The  VP  group  had  not  been  exposed  to  the  new  combination  principle  and  was 
expected  to  provide  an  incorrect  answer  as  well.  The  counterbalanced  VP  groups 
practiced both principles in this context an even number of times. According to the learning
mechanism,  they  should  be  as  likely  to  use  one  rule  as  they  were  the  other.  Thus,  there 
should  be  no  difference  between  the  VP  and  CP  groups  for  transfer  on  the  Combination 
problem  overall. 

The  third  condition  (different  features  / same  principle)  was  referred  to  as  the 
Remote  problem  (Figure  5).  The  Remote  problem  was  a classic  remote  transfer  task.  All 
participants  learned  the  principle  necessary  to  provide  the  correct  solution  but  none  of 
them  had  witnessed  the  principle  in  this  context  during  training.  In  this  condition,  VP 
was  expected  to  provide  the  appropriate  solution  while  both  CP  groups  would  answer 
incorrectly. 

The  fourth  condition  (different  features  / different  principle)  was  referred  to  as  the 
New  problem  (Figure  6).  This  problem  rounded  out  the  factorial  design.  All  groups  were 
predicted  to  provide  incorrect  responses  to  this  question.  No  exact  predictions  were  made 
about  the  nature  of  the  responses  the  groups  would  make  but  it  could  be  important  if  a 
particular  incorrect  answer  occurred  more  frequently  than  expected  within  and  between 
the  groups. 
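
The four conditions and their predicted outcomes can be summarized in a short Python sketch. The data structure and function names are illustrative assumptions. The predictions themselves only restate those given above and in Table 2, with the CPa and CPb groups collapsed into a single CP entry.

```python
from itertools import product

LEVELS = ("same", "different")

# Problem names keyed by (surface features, principle).
PROBLEM_NAMES = {
    ("same", "same"): "Learned",
    ("same", "different"): "Combination",
    ("different", "same"): "Remote",
    ("different", "different"): "New",
}


def predicted_correct(group, features, principle):
    """Return True if the group is predicted to answer the problem correctly."""
    if principle == "different":
        return False          # no group practiced the required principle
    if features == "same":
        return True           # basic learning; all groups should succeed
    return group == "VP"      # remote transfer is predicted only after variable practice


for features, principle in product(LEVELS, LEVELS):
    name = PROBLEM_NAMES[(features, principle)]
    predictions = {g: predicted_correct(g, features, principle) for g in ("CP", "VP")}
    print(name, predictions)
```

Only the Remote problem separates the two training groups in this summary, which is why it serves as the critical test of transfer.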

Results 

The  predominant  question  in  the  transfer  task  was  whether  the  contextual 
variation in practice improved the likelihood of transfer. The results of the first experiment
compared and contrasted the performance of the CP and VP groups over the training and
testing phases of the experiment. Of interest were the differences in accuracy and
response  times  for  the  groups. 

Training 

Two  participants  from  the  VP  group  and  one  participant  in  the  CP  group  were 
dropped  from  the  study  for  failing  to  achieve  the  70%  accuracy  level  in  the  training  phase 
and  not  included  in  the  data  analysis.  A summary  of  within  and  between  group 
comparisons  over  the  training  is  given  in  Table  3. 

Accuracy  and  response  times  (RTs)  were  first  compared  between  the 
counterbalanced  groups  within  CP  and  VP  for  the  training  data.  An  alpha  level  of  0.05 
was used for all statistical tests. There was no difference between CPa and CPb (t (29) =
2.05, p > 0.05) or VPa and VPb (t (28) = 2.05, p > 0.05) for average number of training
problems solved correctly. Additionally, Mann-Whitney U-tests revealed no difference
between response times for the CPa and CPb groups (U (16, 15) = 70, p > 0.05) or the VPa
and VPb groups (U (13, 17) = 101, p > 0.05). Therefore, the accuracy and RT data were
combined within each training group.

The  average  number  of  training  problems  solved  correctly  by  the  CP  group  was 
27.16 (SD = 3.03) out of 30 problems. The average number of training problems solved
correctly by the VP group was 23.57 (SD = 3.81). The CP group solved more training
problems correctly (t (59) = 1.67, p < 0.001).

The  response  time  data  showed  similar  results.  The  mean  completion  time  to 
solve all 30 training problems for the CP group was 9.94 (SD = 2.84) minutes and 13.02
(SD = 5.45) minutes for the VP group. The CP group completed the training faster than
the VP group (U (31, 30) = 302, p < 0.05).




Testing 

Following  the  training  phase,  participants  who  fulfilled  the  accuracy  criteria 
proceeded  to  the  transfer  task.  There  were  four  problems  in  the  transfer  task  having  a 2x2 
combination  of  features  and  principles:  same  features  / same  principle  (Learned  problem), 
same  features  / different  principle  (Combination  problem),  different  features  / same 
principle  (Remote  problem),  and  different  features  / different  principle  (New  problem). 
The accuracy data and statistical comparison for each group on each transfer problem are
summarized in Table 4 as proportions of the total group.

Learned  problem.  For  the  same  features  / same  principle  problem  illustrated  in 
Figures 2 and 3, both groups performed well. The proportion of CP that gave the correct
response (1.0) was not different from the proportion of VP that gave the correct response
(0.93) (χ² (1, N = 59) = 0.15, p > 0.05).

It  is  important  to  restate  that  CPa  and  VP  groups  received  the  problem  shown  in 
Figure  2.  The  CPb  group  received  the  problem  shown  in  Figure  3 because  it  met  the 
same  feature  / same  principle  requirement  based  on  the  pairing  of  surface  feature  and 
principle  they  practiced  during  training.  There  was  no  difference  in  the  proportion  of  CPa 
solving  the  problem  in  Figure  2 correctly  and  the  proportion  of  CPb  that  solved  Figure  3 
correctly.  The  counterbalanced  groups  within  CP  were  combined  for  comparison  with 
the  VP  group. 

Combination  problem.  For  the  same  feature  / different  principle  problem 
illustrated  in  Figure  4,  no  CP  members  gave  the  correct  answer.  However,  the  proportion 
of  CP  members  giving  the  predicted  incorrect  response  was  0.90. 



The responses from the VP group showed wider variation. Most
members (0.70) used one rule or the other as predicted. Approximately 0.33 of the
members of the VP group used the difference rule and responded with an answer of 3.
Approximately 0.37 responded with an answer of 30, suggesting they used the compliance
rule. In a few cases (0.13), participants in the VP group correctly combined both
principles and accurately solved the problem. However, there was no difference in
accuracy between the CP and VP groups (χ²(1, N = 57) = 0.44, p > 0.05).

Remote problem. The classic test for transfer of problem solving was the
different features / same principle problem illustrated in Figure 5. The proportion of
participants in the CP group giving the correct answer was 0.09 compared to 0.67 for the
VP group. The difference between these two groups was highly significant,
(χ²(1, N = 23) = 12.56, p < 0.001). This effect shows the key difference between the CP and VP
groups.

New  problem.  On  the  different  features  / different  principle  problem  illustrated  in 
Figure  6,  the  CP  and  VP  groups  both  exhibited  poor  performance  with  no  members 
providing  a correct  response.  There  was  no  discernible  pattern  in  the  response  from 
either  group  that  would  suggest  a common  strategy. 

An analysis of the total completion time (in minutes) for the four questions
showed no difference between the CP and VP groups (M = 4.58, SD = 2.99; M = 4.95, SD
= 4.35, respectively, U(31, 30) = 449, p > 0.05).

Discussion 

At first glance, the CP group seems to outperform the VP group during training.
The CP group learned the rules significantly faster and achieved higher accuracy scores
during training. However, the CP group did not perform as well as the VP group on the
transfer tasks, and there was no difference in response time between the two groups on
any transfer problem. The primary question in this study was whether the
contextual variability in practice would facilitate the transfer of problem solving skill.
The results indicate that this is the case. This effect has been demonstrated in motor
tasks (Catalano & Kleiner, 1984; Kerr & Booth, 1978) and verbal tasks (Bransford,
Franks, Morris, & Stein, 1979) but not in a problem solving task such as this.

The exemplary performance of both groups on the Learned problem confirmed
that  they  did  learn  the  rules.  The  performance  of  the  CP  group  on  the  Combination 
problem  (same  features  / different  principle)  suggests  that  they  recognized  the  problem 
and  predictably  applied  the  solution  procedure  they  had  associated  with  that  context 
during  training.  For  the  VP  group,  it  was  predicted  that  they  could  recognize  both 
principles  in  the  problem  and  would  apply  one  principle  or  the  other.  In  a few  of  the  VP 
cases,  participants  correctly  combined  the  rules. 

The  Remote  transfer  problem  showed  the  predicted  difference  between  the  two 
groups.  Approximately  70%  of  the  VP  group  accurately  solved  the  remote  transfer 
problem  while  very  few  from  the  CP  groups  solved  it.  This  suggests  that  the  variation  in 
practice  context  does,  in  fact,  foster  the  transfer  of  problem  solving  skills  to  new 
domains. 

The degree of transfer obtained by the VP group was less than expected, however,
and there could be several reasons for this. One factor that may have limited the amount of
transfer obtained by this group was that only two contexts were juxtaposed, which may not be
enough. In other studies of transfer, three contexts have been required to foster
spontaneous transfer without a hint (Gick & Holyoak, 1980; Gick & Holyoak, 1983). It is
possible that, in some cases, two contexts were enough to raise suspicions that the
problems shared similarities at a structural or semantic level but a third could have
confirmed these suspicions.

The overall prediction for the Remote transfer problem was that the VP group would be
significantly more accurate at solving this problem than the CP group. This prediction
was supported. It is clear that contextual variation in practice improved transfer, but why?
Experiment 2 was concerned with the underlying processes that gave rise to the
differences between the groups.


[Diagram labels recovered: Electrical Circuit; Voltage = A; Current = B; Resistance = C; Elasticity = D; Voltage Change = E]

Figure 1

Examples of two different but analogous contexts




Table 1

Practice Conditions for Each Experimental Group in Experiment 1 and Experiment 2ᵃ

Consistent Practice Group (n = 31)                          Variable Practice Group (n = 30)
CPa (n = 15)                 CPb (n = 16)

Principle 1 in Context A     Principle 1 in Context B       Principle 1 in Context A and B
(15 trials)                  (15 trials)                    (15 trials)

Principle 2 in Context B     Principle 2 in Context A       Principle 2 in Context A and B
(15 trials)                  (15 trials)                    (15 trials)

ᵃ Practice trials were presented in a random order. See text for details.




Table 2

Conditions in Transfer Task and Prediction for Transfer Accuracy for Each Group in Experiment 1

Transfer Condition                                                        Group Accuracy
Name          Surface     Structural   Transfer Task                      Consistent        Variable
              Features    Relations                                       Practice (CP)     Practice (VP)

Learned       Same        Same         Principle 1 in Context A           % Correct     =   % Correct
                                       CPb: Principle 1 in Context Bᵃ
Combination   Same        Different    Principle 3ᵇ in Context B          % Correctᶜ    =   % Correct
Remote        Different   Same         Principle 2 in Context Dᵇ          % Correct     <   % Correct
New           Different   Different    Principle 4ᵇ in Context Dᵇ         % Correct     =   % Correct

ᵃ CPb received a different problem than CPa and VP due to counterbalanced training. See text for details.
ᵇ Principle or Context not shown before during training.
ᶜ In this condition, both CP groups should give a predicted incorrect response. See text for details.




Table 3

Summary and Comparison of Performance Within and Between Groups Over Training in Experiment 1

Training Condition    Number Correct       Completion Time (min)
                      M (SD)               M (SD)

CPa                   28.81 (3.76)         8.3 (2.41)
CPb                   26.40 (2.10)         11.33 (3.21)

VPa                   24.04 (4.01)         13.60 (6.91)
VPb                   23.13 (3.61)         12.59 (4.20)
                      *                    *

CP total              27.16 (3.03)         9.94 (2.84)
VP total              23.57 (3.81)         13.02 (5.45)
                      t(59) = 1.67,        U(31, 30) = 302,
                      p < 0.001            p < 0.05

* No statistical difference between groups




Table 4

Summary and Comparison of Accuracy Within and Between Groups Over the Transfer Tasks in Experiment 1

Training      Transfer Condition (Proportion of group)                          Completion Time (min)
Condition     Learned    Combination        Remote                   New        M (SD)
                         (as predicted)ᵃ

CPa           1.00       1.00               0.06                     0.00       4.68 (2.11)
CPb           1.00       0.79               0.14                     0.00       4.45 (3.84)
              *          *                  *                        *          *

VPa           1.00       0.67               0.62                     0.00       6.10 (6.12)
VPb           0.88       0.73               0.71                     0.00       4.08 (2.10)
              *          *                  *                        *          *

CP total      1.00       0.90               0.09                     0.00       4.58 (2.99)
VP total      0.93       0.70               0.67                     0.00       4.96 (4.36)
              *          *                  χ²(1, N = 23) = 12.56,   *          *
                                            p < 0.001

ᵃ For the Combination problem, the comparison of predicted responses was favored over true accuracy. See
text for details.

* No statistical difference between groups




[Diagram of an electrical circuit. Labels: Voltage X = -10; Voltage Y = 8; Resistance = 3; Elasticity = 7; Change Voltage = 5; Resistor Current = ?]

Figure 2

The Learned problem given to CPa and VP coupled a known principle (difference rule)
with the surface features (electrical circuit) in which it was learned




[Diagram of an electrical circuit. Labels: Voltage X = -10; Voltage Y = 8; Resistance = 3; Elasticity = 2; Change Voltage = -3; Capacitor Current = ?]

Figure 3

The Learned problem given to CPb coupled a known principle (compliance rule) with the
surface features (electrical circuit) in which it was learned




Figure  4 

The  Combination  problem  in  the  transfer  task  required  a new  principle  (the  combination 
of  both  rules)  applied  to  a familiar  context  (respiratory  system) 




[Diagram of a water treatment system. Labels: Flexibility in Tank 2 = 8; Shift in Pull from Tank 1 = -5; Shift in Pull from Tank 2 = ?; Drift rate = 48; Frictional Opposition of pipe = 100]

Figure 5

The Remote problem in the transfer task required a learned principle (compliance rule)
applied to an unfamiliar context (water treatment system)




[Diagram of a water treatment system. Labels: Shift in Pull from Tank 2 = -4; Pull (2) = 0; By what exact amount does the pipe facilitate the passage of sewage = ?]

Figure 6

The New problem in the transfer task required an unknown principle (conductance rule)
applied to an unfamiliar context (water treatment system)


EXPERIMENT  2 


The  transfer  task  measured  accuracy  on  questions  that  differed  in  their  surface 
features  and  structural  relations.  Nothing  has  been  said  up  to  this  point  about  how 
participants  were  believed  to  perform  these  tasks  and  what  processes  were  responsible  for 
the  characteristic  performance  of  each  group.  The  matching  task  was  constructed  to  test 
predictions  based  on  processes  that  were  presumed  to  account  for  the  pattern  of  results  in 
the  transfer  task. 

A learning  mechanism  was  described  earlier  that  provided  an  explanation  for  how 
contextual  variability  could  improve  transfer.  At  the  core  of  the  theory  is  the  belief  that 
varying  the  surface  level  information  strengthens  the  activation  of  higher  level 
information  (semantic  knowledge  in  this  case)  in  the  solver’s  mental  representation  of  the 
problem.  Similarity  between  problems  at  a semantic/structural  level  is  more  likely  to 
indicate  that  a solution  procedure  can  be  transferred  across  problems  that  appear 
physically  different. 

Over the training the VP group received, categorizing problems based on features
led to failed solution attempts. The information in the problems that indicated which
principle to use was the structural information. For example, the critical pieces of
semantic information for the difference principle (Y = (X1 - X2) / Z) were “flow”,
“difference between quantities”, and “resistance”. The critical piece of causal
information was that the difference occurred across the resistance. Together, these were
the structural elements that would classify the problem as a “difference” problem.
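
As a worked illustration of the difference principle, the sketch below evaluates Y = (X1 - X2) / Z in Python. The function name and the input values are hypothetical and are not taken from the experimental problems.

```python
def difference_rule(x1, x2, z):
    """Difference principle from the text: Y = (X1 - X2) / Z.

    x1, x2: the two quantities whose difference drives the flow.
    z: the resistance across which the difference occurs (must be nonzero).
    """
    if z == 0:
        raise ValueError("resistance z must be nonzero")
    return (x1 - x2) / z


# Hypothetical values: a difference of 12 - 6 across a resistance of 3.
print(difference_rule(12, 6, 3))  # 2.0
```

The computation is the same whether the inputs are labeled as voltages and electrical resistance or as the corresponding quantities in another context, which is the sense in which the structural elements, rather than the surface features, identify a “difference” problem.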

During  training,  the  terms  in  the  problem  are  encoded  and  pass  activation  along  to 
their  semantic  categories.  Due  to  the  contextual  variability  in  training,  the  base-level 
activation  of  the  semantic  categories  would  come  to  dominate  categorization  of  the 
problems  and  retrieval  of  associated  principles.  Therefore,  the  semantic  categories  that 
define  a “difference”  or  “compliance”  problem  are  searched  for  in  a new  problem  before 
the  physical  features  that  define  a “circuit”  or  “respiratory”  problem.  This  led  to  the 
predictions  summarized  in  Table  5. 

Stated  briefly,  the  process  offered  by  most  theories  of  transfer  suggests  that  to 
solve  these  problems,  participants  retrieve  a previously  encountered  problem  and  attempt 
to  fit  the  retrieved  solution  to  the  problem  at  hand.  The  retrieval  of  previously 
encountered  problems  is  believed  to  rely  on  the  perceived  similarity  between  the  target 
problem  and  stored  problems.  If  this  is  true,  then  a task  requiring  participants  to  match 
one  problem  to  another  based  on  their  evaluation  of  the  similarity  between  the  two  could 
reflect  the  process  involved  in  the  early  stages  of  transfer.  This  is  the  goal  of  the 
matching  task. 

It  was  believed  that  the  judgment  of  similarity  between  the  transfer  problems  and 
alternative  problems  would  depend  on  the  internal  representation  the  solver  forms  of 
each.  Differences  in  the  matching  of  similarity  between  the  training  groups  should  reflect 
differences  in  their  internal  representations  of  the  problems. 

For the matching task, there were several predictions, outlined in Table 5. It was
expected that the VP group, which received contextual variability in practice, would be more
likely to match transfer problems to alternative problems that shared structural
similarities. The CP group practiced principles within specific domains and should
match transfer problems to alternative problems that share surface features. A
comparison of group performance to the predictions in Table 5 served as an evaluation of
the theory.

Method 

Participants 

The participants in Experiment 2 were the same participants as in Experiment 1.

Apparatus and Stimuli

The materials in Experiment 2 involved the same principles and contexts as
Experiment 1. In the matching task, participants were presented the same problems they
attempted  to  solve  during  the  transfer  task  one  at  a time.  The  alternative  problems  were 
viewed  by  scrolling  through  a window  that  only  permitted  one  problem  to  be  seen  at  a 
time.  This  was  to  minimize  the  likelihood  that  participants  would  compare  the  alternative 
problems  to  one  another  or  devise  other  strategies. 

Responses were made by entering the number of the alternative problem they
chose as the most similar. As with the transfer task, participants were asked to provide a
brief description of what they thought as they attempted to solve the problem. They gave
these retrospective protocols by typing them into a field in the computer program that
administered the experiment. The protocols were not intended to provide formal data
but possible insight into the strategies used by participants to solve the problems. No
feedback was given. Table 5 gives a description of the conditions in the matching task
and the predictions made according to the learning mechanism.




It  is  important  to  define  what  similarity  meant  in  the  matching  task.  Participants 
were  asked  to  match  the  target  problem  to  an  alternative  problem  that  seemed  to  have  the 
most  similar  solution  procedure.  This  was  intended  to  focus  them  on  selecting  a problem 
that  would  be  useful  for  a transfer  solution. 

Procedure 

After completing the transfer task, participants were given a short break and then
proceeded to the matching task. Each training group again received the four transfer
problems in a randomized set. Response times were measured from the presentation of
the problem to the first keystroke they entered for their solution. Participants were informed
that response times would be measured but instructed to focus on accuracy.

Participants  were  shown  the  original  transfer  problem  and  asked  to  select  one 
alternative  problem  from  a set  of  six  that  seemed  to  be  the  most  similar  to  the  transfer 
problem.  It  was  presumed  that  the  participant’s  selection  was  made  based  on  matches  to 
the  features  in  the  target  problem  that  they  perceived  as  important  to  the  solution.  It  was 
also  presumed  that  these  features  were  the  basis  for  categorizing  and  retrieving  problems 
from  memory  during  the  transfer  task. 

Results 

The  predictions  for  the  CP  group  were  that  they  would  match  the  original  transfer 
problem  to  an  alternative  problem  having  similar  surface  features  regardless  of  the 
principle  the  alternative  problem  required.  The  VP  group,  by  contrast,  was  predicted  to 
match  the  transfer  problems  to  alternative  problems  requiring  the  same  principle 
regardless  of  the  surface  features  of  the  problem. 




No comparison between the counterbalanced groups within a training group
resulted in significant differences on any of the four matching task problems or for the
total completion time of the tasks. Therefore, the data within each training group were
aggregated. The results of the within-group tests and the results for each group on each
of the four matching problems are reported below and summarized in Table 6.

For each matching task, two proportions were compared: the proportion of the
group selecting the alternative problem with the same surface features and the proportion
of the group selecting the alternative problem requiring the same principle.

Matching to Learned problem. The original Learned problem (Figure 2) involved
the difference rule (compliance rule for CPb, Figure 3) in the context of an electrical
circuit. The prediction for matching was that the CP group would be more likely to
match this problem with the problem from the alternative set that involved an electrical
circuit (Figure 7). The proportion of CP that chose the circuit problem (0.81) was greater
than the proportion of VP that chose the circuit problem (0.27), (χ²(1, N = 34) = 7.53, p <
0.01).

The prediction for the VP group was that they would be more likely to match this
problem with the problem from the alternative set that involved the difference rule
(Figure 8). The proportion of VP that chose the difference problem (0.53) was much
larger than the proportion of CP that chose the difference problem (0.06), (χ²(1, N = 17)
= 13.24, p < 0.001).

A comparison of the response distributions for each group is shown in Figure 9.
In this figure, the difference between the two groups in similarity matching to the Learned
problem is clear. This figure also shows a large proportion of the VP group (0.27) that
gave the response predicted for the CP group.

Matching  to  Combination  problem.  The  original  Combination  problem  (Figure  4) 
involved  a combination  of  the  difference  rule  and  the  compliance  rule  in  the  context  of  a 
respiratory  system.  A comparison  of  the  response  distributions  for  each  group  is  shown 
in  Figure  9. 

After the data from all experiments were collected, a confound was discovered in
the  Combination  matching  task.  The  alternative  problem  having  the  same  features  as  the 
Combination  problem  also  required  the  compliance  rule  for  its  solution.  Therefore,  the 
number  of  participants  who  chose  this  problem  as  a match  based  on  surface  features 
cannot  be  distinguished  from  the  number  who  chose  it  as  a match  based  on  the 
compliance  principle.  This  is  clearly  shown  in  the  predictions  for  the  Combination 
matching  in  Table  5. 

The prediction for the CP group was that they would choose this problem as a
match to the Combination problem based on surface features. There was no difference
between CPa and CPb in selecting this alternative problem (χ²(1, N = 22) = 0.73, p >
0.05). CPb was predicted and shown to use the compliance rule to solve the Combination
transfer problem. Due to the confound, it is not possible to deduce whether members of the
CPb group chose this alternative problem as a match based on surface features or on the
principle. Given that there was no difference between the CPa and CPb
groups on any of the transfer and matching tasks, and given the performance of the entire CP
group on the other matching and transfer tasks, it is most likely that it was a match based
on surface features.




The  performance  of  the  VP  group  is  more  difficult  to  assess  in  light  of  the 
confound.  The  prediction  of  the  learning  mechanism  was  that  VP  learned  both  principles 
and  contexts  equally  and  would  use  one  principle  or  the  other  to  solve  the  Combination 
transfer  problem.  In  the  Combination  matching  problem,  the  alternative  problem  chosen 
by  VP  members  should  correspond  to  whichever  principle  they  used  to  solve  the  transfer 
problem.  This  would  indicate  they  were  able  to  recognize  the  embedded  principle  in 
various  contexts. 

Of the VP members who used the difference principle to solve the Combination
problem, 60% also chose the alternative problem requiring the difference rule. Of the VP
members who used the compliance rule to solve the Combination problem, 82% also
chose the alternative problem with the confound. Due to the confound, this percentage
included matches based on surface features, the compliance principle, and the combined
influence of both. Across the entire VP group, 57% selected the alternative problem with
the same features as the Combination problem. Therefore, further comparisons against
these data were omitted. Many of the interesting comparisons were masked by the
confound in this problem.

The proportion of the CP group selecting the alternative problem whose solution
involved the combination of both rules was 0.10. The proportion of the VP group who
made this same match was 0.07. There was no significant difference between these
groups (χ²(1, N = 53) = 0.17, p > 0.05).

Matching to Remote problem. The original Remote problem (Figure 5) involved
the difference rule in the context of a water treatment system. The prediction for
matching was that the CP group would be more likely to match this problem with the
problem from the alternative set that also involved a water treatment system (Figure 11).
The proportion of CP that chose the water treatment problem (0.74) was larger than the
proportion of VP that chose the water treatment problem (0.17), (χ²(1, N = 29) = 9.97, p
< 0.01).

The prediction for the VP group was that they would be more likely to match this
problem with the problem from the alternative set that involved the difference rule
(Figure 12). The proportion of VP that chose the difference problem (0.57) was greater
than the proportion of CP that chose the difference problem (0.06), (χ²(1, N = 19) =
11.84, p < 0.01).

The  difference  between  the  two  groups  in  frequency  of  response  is  clearly  seen  in 
Figure  9.  As  with  matching  to  the  Learned  problem,  there  was  a proportion  of  the  VP 
group  (0.17)  that  gave  the  response  predicted  for  the  CP  group. 

Matching to New problem. The original New problem (Figure 6) involved an
unknown rule related to resistance in the context of a water treatment system. The
prediction for matching was that there would be no difference between the two training
groups. The belief was that neither group was given the appropriate principle during
training; therefore, both groups would rely on the surface information to make a match in
the absence of information about the deeper structure. The proportion of CP
that chose the water treatment problem from the alternative set (Figure 13) was 0.73. The
proportion of VP that chose the water treatment problem was 0.70. This difference was
not significant (χ²(1, N = 43) = 0.02, p > 0.05). A comparison of the response
distributions for each group is shown in Figure 9.




Discussion 

Although a confound in the matching Combination problem prevented a complete
analysis for that problem, the results of the matching task showed the predicted
differences between the CP and VP groups overall. When asked to select a problem
having a similar solution procedure to the target problem, the CP group favored problems
with matching surface features. They did this both for problems that looked familiar and
for problems that looked unlike those they had previously encountered.

This matching behavior corresponded to their performance on the transfer task. A
plausible explanation for the transfer process is that features of the problem determine its
categorization and cue the principle associated with the category. This occurs in spite of
the fact that the problem could have a different goal. In the matching task, the CP group
overwhelmingly chose problems that had the same surface features as the transfer
problem regardless of the embedded principle involved in the problems.

As stated earlier, the matching performance of the VP group exhibited more
variability in response but was generally consistent with predictions. Participants in the
VP group were more likely to select problems solved by similar principles regardless of
physical appearance when asked to match problems based on similarity of solution
procedure. This was evident in the different predictions for the CP and VP groups on the
Learned match and the Remote match. The Learned problem and Remote problem were
solved by familiar rules. When matching alternative problems to these two problems, VP
members were more likely to choose problems solved using the same principle. This
contrasted with the results from the CP group, which tended to choose problems with
similar surface features.



There  was  more  variability  in  the  responses  made  by  the  VP  group  on  the 
matching  task.  There  are  several  possible  explanations  for  this.  In  the  transfer  task, 
several  participants  from  the  VP  group  failed  to  solve  the  remote  transfer  problem.  The 
matching  performance  of  the  non-transferring  VP  members  was  the  same  as  that 
predicted  for  the  CP  group.  It  was  possible  that  the  variation  in  practice  did  not  foster  a 
more  abstract  categorization  of  problems  for  some  members  of  the  VP  group. 

In  addition  to  the  failure  of  the  variable  practice  manipulation  to  foster  transfer, 
protocols  given  by  participants  suggested  they  used  other  strategies  to  match  an 
alternative  problem  to  the  transfer  problem.  The  way  the  task  was  designed  allowed 
participants  to  directly  compare  the  original  problem  to  an  alternative  problem  without 
first  considering  how  they  originally  solved  it. 

From the protocols, it was evident that many participants compared all the
problems to the target problem and selected problems that had similar goals. If the goal of
the target problem was to solve for “current”, some participants selected an alternative
problem that required a solution for “drift rate”. They inferred that if the problems had
similar goals, then perhaps they were solved in a similar fashion. In cases where more
than one base problem had a goal similar to that of the target problem, some participants further
compared the alternatives and target for similar variables that could be used to solve the
problem. However, they seemed not to care whether the problems actually
involved the same solution procedures. It seemed enough to find an alternative problem
that had the same goal and some parameters that suggested a solution to the problem.
The various strategies that participants could employ during the matching task present a
potential source of variance in the data.




It was possible that the training emphasized to the VP group that the principles
applied across domains and that there was no actual benefit of priming semantic categories as
the learning mechanism suggested. The VP group may have learned to try to fit both
principles into the problem. If this were the case, differences between these two groups
in transfer and matching would be minimized if the CP group were led to a deliberate
transfer strategy. This issue was explored in Experiment 3.




Table 5

Predictions for Training Groups on Matching Task in Experiment 2

Transfer       Principles/Context               New Example Problems          Predicted matches by
Condition                                                                     experimental group
                                                                              CPa    CPb    VP

Learned        Principle 1 in Context A         Principle 2 in Context B
               CPb: Principle 1 in Context B    Principle 3ᵃ in Context A
                                                Principle 4ᵃ in Context B
                                                Principle 1 in Context Dᵃ
                                                                              X      X      X

Combination    Principle 3ᵃ in Context B        Principle 3ᵃ in Context A
                                                Principle 2 in Context B
                                                Principle 1 in Context Dᵃ
                                                Principle 4 in Context C
                                                                              X      Xᵇ     Xᵇ Xᶜ

Remote         Principle 2 in Context Dᵃ        Principle 1 in Context Dᵃ
                                                Principle 2 in Context B
                                                Principle 3ᵃ in Context A
                                                Principle 4ᵃ in Context A
                                                                              X      X      X

New            Principle 4ᵃ in Context Dᵃ       Principle 2 in Context C
                                                Principle 4ᵃ in Context A
                                                Principle 1 in Context Dᵃ
                                                Principle 2 in Context B
                                                                              X      X      X

ᵃ Principle or Context not shown before during training.
ᵇ After all the data was collected, a confound was discovered that limits the analysis of this prediction. See
text for details.
ᶜ According to the learning mechanism, the VP group should match the Combination problem to the
principle they used to solve it during the transfer task in Experiment 1. This leads to 2 predictions for the
VP group in this condition. See text for details.


Table 6

Summary and Comparison of Performance Within and Between Groups for the Matching Tasks in Experiment 2

Training       Matching Condition (Proportion of Group)          Completion Time (min)
Condition      Learned    Combination    Remote    New           M (SD)

[Table body not recoverable from the source text]

[Diagram labels, alternative problem 2: Voltage Change = 10; Elasticity = 10; Voltage X = 0, Voltage Y = -4;
Resistance = 2; Capacitor Current = ?; Resistance = 2]


Figure  7 

CP  was  predicted  to  select  this  problem  as  a match  to  the  Learned  problem  based  on 
surface  features 


[Diagram labels, alternative problem 4: Tank 2 flexibility = 5; Pull (1) = 0]


Figure  8 

VP  was  predicted  to  select  this  problem  as  a match  to  the  Learned  problem  based  on  the 
embedded  principle 


[Plot axes: Number of Observations (vertical); Similarity Match Selection (horizontal;
F = feature, P = principle, d = distracter)]


Figure 9

Frequency distribution of similarity matches for CP and VP in Experiment 2
a In the Combination problem, the alternative problem with matching features also
required the compliance principle
b This problem required the difference principle
c This problem required the actual combination principle




Figure  10 

Both  CP  counterbalanced  groups  were  predicted  to  select  this  problem  as  a match  to  the 
Combination  problem  based  on  surface  features.  However,  this  problem  can  also  be 
solved  by  the  compliance  principle.  Therefore,  participants  in  the  CPb  and  VP  groups 
who  used  the  compliance  principle  to  solve  the  Combination  problem  may  have  correctly 
chosen  this  problem  as  having  the  same  solution  procedure.  See  text  for  details. 


[Diagram labels: Flexibility in Tank 2 = 8; Drift rate = ?]


Figure 11

CP was predicted to select this problem as a match to the Remote problem based on
surface features




Figure  12 

VP  was  predicted  to  select  this  problem  as  a match  to  the  Remote  problem  based  on  the 
embedded  principle 


[Diagram labels: Shift in Pull from Tank 2 = 0; Tank 2 flexibility = 11; Pull (2) = 0]


Figure  13 

Both  CP  and  VP  were  predicted  to  select  this  problem  as  a match  to  the  New  problem 
based  on  surface  features 


EXPERIMENT  3 


The major finding of Experiment 1 was that contextual variability improved the
likelihood of transferring problem solving skills to new domains. The major finding from
Experiment 2 was that contextual variability also improved the ability of participants to
recognize these principles in other domains. It is believed that contextual variability led
to the adaptation of strategies that focused on invariant information (semantic information
in these cases) over the inconsistent, perceptual information given in the problem. As a
result, the VP group’s primary strategy to solve a transfer problem was to retrieve a
problem from memory cued by semantic information. The CP group received training
without contextual variability and was believed to use a strategy that focused on the
lower-level perceptual information in the problem to retrieve a problem from memory. This
simple strategy proved successful over the training the CP group received.

Experiment 3 focused on the CP group. It was clear that
contextual variability in training improved transfer of problem solving skills. However,
what could enable the CP group to transfer their skills to a new domain without
retraining them? The true test of transfer in Experiment 1 was the Remote transfer problem,
whose solution required a learned principle but was presented in a context designed to
minimize any perceptual similarity to another problem. In the real world, however,
problems range in their visual likeness and many problems that share solution procedures
also appear perceptually similar.




Experiment  3 offered  a closer  look  at  the  similarity  between  problems  and  the 
ability  of  participants  to  transfer  solutions.  In  particular,  this  study  examined  the  effect 
that  solving  a simple  transfer  problem  had  on  a remote  transfer  problem.  Holyoak  and 
Koh  (1987)  found  that  participants  were  more  likely  to  solve  an  analogical  reasoning 
problem  if  they  were  given  a story  beforehand  that  was  highly  similar  in  surface  and 
structural  features. 

If  participants  are  given  a transfer  problem  closer  in  similarity  to  a known 
problem  than  the  remote  transfer  problem,  is  it  possible  that  this  “near”  transfer  problem 
improves  performance  on  a remote  transfer  problem?  The  simpler  near  transfer  problem 
may  demonstrate  to  solvers  that  the  principles  apply  to  other  problem  domains.  This  may 
encourage  a deliberate  strategy  to  look  for  properties  in  remote  transfer  problems  that  cue 
analogous  problems  in  memory  they  might  otherwise  neglect. 

According to the learning mechanism, which describes only a bottom-up
recognition process, the CP3 group (the group trained in Experiment 3) is no different from
the CP group. A single-trial exposure to a simple transfer problem would not have a
meaningful effect on the limited semantic network compared to the effect of training.
Therefore, the predictions for the CP3 group are the same as for the CP group on transfer
and matching.

The  performance  of  participants  in  this  study  was  compared  to  that  of  the  CP  and 
VP  groups  in  Experiments  1 and  2 to  assess  the  effect  of  a near  transfer  problem  on 
solving  a remote  transfer  problem. 




Method 

Participants 

The participants in this experiment were solicited from the student population by
advertising in the student paper and posting flyers around the campus. Thirty students
from the University of Florida were recruited and paid $10 for their participation. For
motivation, there was a $30 bonus for the participant who solved the most
problems correctly. The participants ranged in age from 18 to 37 with a modal age of 20.
Of these 30 participants, 18 were female and 12 were male.

Apparatus  and  Stimuli 

The same laboratory setting, screening procedures, and training materials were
used as in Experiment 1. There were three types of transfer questions that the participants
solved following the training.

The  first  type  of  problem  was  the  Near  transfer  problem.  An  example  of  the  Near 
transfer  problem  is  given  in  Figure  14.  This  problem  was  situated  in  an  unfamiliar 
context  but  involved  labeling  very  close  to  one  of  the  rules  learned  during  training. 
Specifically,  this  problem  required  the  difference  rule  for  the  solution  and  was  presented 
in  the  context  of  a water  treatment  system.  However,  the  labeling  on  the  diagram  was 
borrowed  from  the  circuit  problem,  which  was  the  context  in  which  this  group  practiced 
the  difference  rule.  The  labeling  was  not  identical  to  that  used  in  the  circuit  problem  but 
very  similar.  As  a result,  this  problem  was  a kind  of  transfer  problem  that  contained  very 
similar  verbal  cues  for  retrieving  the  appropriate  analogous  problem  from  memory. 

The purpose of this problem was to test whether participants could solve a transfer
problem if it was partially similar in context to a problem from training. From the
perspective of the learning mechanism, the labeling in the Near problem would provide
recognizable surface features that cue the retrieval of circuit problems and the difference
rule associated with them during training.

The second type of problem was the classic Remote transfer problem given in
Experiment 1. It required the compliance rule to provide the solution but was situated in
the unfamiliar context of a water treatment system. An example of the Remote transfer
problem is given in Figure 5. The purpose of this problem was to test whether
participants were any better at solving this remote transfer problem after attempting to
solve a near transfer problem than the CP group was in Experiment 1.

The third type of problem in this transfer task was the same Combination problem
from Experiment 1 involving a familiar domain but requiring an unfamiliar principle for
the solution. The solution to this problem required a novel rule involving the
combination of both learned principles. This problem was situated in the familiar context
of a respiratory system. An example of this problem is given in Figure 4. The purpose of
this problem was to test whether participants were as likely to make the predictable
mistake on this problem after attempting to solve both the Near and Remote transfer
problems as the CP group in Experiment 1.

Procedure 

The training given to participants in Experiment 3 (CP3) was the same as that
given to the CPa group in Experiment 1. Since there were no differences between the
counterbalanced groups in Experiments 1 or 2, counterbalancing was not employed in
Experiment 3.


As with the training in Experiment 1, participants were screened for basic algebra
skills and prior familiarity with the problem domains. Furthermore, any participant who
failed to achieve 70% accuracy over the 30 training problems was dropped from the
study.

Following  the  training  phase,  participants  attempted  to  solve  the  three  novel 
problems.  These  transfer  problems  were  presented  in  a specific  order.  Of  interest  was 
the  effect  of  the  Near  transfer  problem  on  accuracy  on  the  Remote  transfer  problem.  For 
this reason, the Near transfer problem had to precede the Remote transfer problem. The
performance of this group on the Remote transfer and Combination problems was
compared to the responses of the CP and VP groups in Experiment 1.

As in Experiment 1, participants made numeric responses to the transfer problems
and were asked to provide a brief description of what they thought as they attempted to
solve the problem. They gave these retrospective protocols by typing them into a field in
the computer program that administered the experiment. The protocol data were not
intended to provide formal data but possible insight into the strategies used by
participants to solve the problems.

Following the transfer problems, all participants completed a matching task like the one in
Experiment 2. Participants were again presented with each of the three
transfer problems they had attempted to solve earlier. They were asked to select, from a set of
six new problems, the one whose solution procedure was most similar to that of the particular
transfer problem. Participants were also asked to provide brief descriptions of what they
thought as they performed each task. The categorical selections made by the group for
each trial were recorded and the distribution of responses was compared to the
distribution of responses for the CP and VP groups in Experiment 2.

The purpose of this task was to assess the ability of participants to recognize the
principles in new contexts. Specifically, it tested whether participants, after attempting to
solve the Near and Remote transfer problems, were any better at recognizing the principles
in new situations than the CP group in Experiment 2.

For  both  the  transfer  and  matching  tasks,  response  times  were  measured  from  the 
presentation  of  the  problem  to  the  first  keystroke  entered  for  the  solution.  Participants 
were  informed  that  response  times  would  be  measured  but  instructed  to  focus  on 
accuracy. 

Results 

For  the  transfer  task  in  Experiment  3,  the  majority  of  participants  were  expected 
to  solve  the  Near  transfer  problem  accurately.  The  labeling  in  the  Near  problem  would 
provide  recognizable  surface  features  that  cued  the  retrieval  of  circuit  problems  and  the 
difference  rule  associated  with  them. 

Although  the  CP3  group  was  predicted  to  solve  the  Near  transfer  problem  based  on 
recognition  of  the  terms,  they  were  predicted  to  match  this  problem  based  on  surface 
features.  While  the  features  used  in  the  Near  problem  cue  the  appropriate  base  problems 
from  memory,  the  CP3  group  should  not  be  any  better  than  the  CP  group  at  recognizing 
semantic  similarities  in  the  alternative  problems  that  would  indicate  the  required 
principle. 

According to the learning mechanism, there was no difference in retrieval process
between the CP3 and CP groups. Therefore, the same transfer and matching predictions
were made for both groups on the Remote and Combination problems. If participants in this
experiment were any better at solving the Remote problem than their counterparts in
Experiment 1, this would suggest that solvers may be led to use deliberate transfer
strategies without explicit instructions or hints.

Training 

A summary of training performance comparisons between the CP3, CP, and VP
groups is given in Table 7. An alpha level of 0.05 was used for all statistical tests. No
participants were dropped for failing to achieve the 70% accuracy criterion.

The average number of training problems solved correctly by the CP3 group was
26.03 (SD = 1.94) out of 30. The CP group solved an average of 27.16 (SD = 3.03)
correctly. There was no difference between these two groups in the number of problems
solved correctly during training (t(59) = 2.00, p > 0.05). The VP group solved an
average of 23.57 (SD = 3.81) correctly. Replicating the findings in Experiment 1, the CP3
group solved more problems correctly than the VP group during training (t(58) = 2.00,
p < 0.005).

The response time data produced similar results. The mean completion time to
solve all 30 training problems for the CP3 group was 9.34 (SD = 3.24) minutes. The CP
group completed the training phase in an average time of 9.94 (SD = 2.84) minutes.
There was no difference between these two groups in completion time for the training
phase (U(30, 31) = 404, p > 0.05). The VP group completed the training phase in an
average time of 13.02 (SD = 5.45) minutes. This was slower than the CP3 group (U(30,
30) = 245, p < 0.01).




Testing 

Transfer  results 

Following the training phase, participants proceeded to the transfer task. Table 8
shows a summary of group accuracy and response times for CP3 on the three transfer
problems in Experiment 3 and comparisons between the CP3 group and the CP and VP
groups from Experiment 1.

Near transfer problem. On the Near transfer problem (Figure 14), the proportion
of the CP3 group that solved the problem correctly was 0.79. Since the Near transfer
problem was not used in Experiment 1, no comparisons can be made.

Remote transfer problem. The classic test for transfer of problem solving was the
Remote transfer problem illustrated in Figure 5. This was the same Remote transfer
problem used in Experiment 1. The proportion of participants in the CP3 group giving
the correct answer to the Remote transfer problem in Experiment 3 was 0.52. In contrast,
the proportion of participants from the CP group from Experiment 1 who solved the
Remote transfer problem correctly was 0.09. The CP3 group was more likely to solve the
Remote transfer problem than the CP group (χ²(1, N = 18) = 8.00, p < 0.01). There was no
difference between the VP group in Experiment 1 and the CP3 group at solving the
Remote transfer problem (χ²(1, N = 35) = 0.71, p > 0.05).
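
As a check on the arithmetic, the two statistics reported for the Remote transfer problem are consistent with a one-degree-of-freedom chi-square test that compares the number of correct solvers in each group against an equal split. The sketch below is illustrative only. The counts (15, 3, and 20 correct solvers) are not stated in the text; they are assumed here because they sum to the reported totals of 18 and 35 and roughly match the reported proportions. The function name is likewise hypothetical.

```python
def chi_square_equal_split(count_a: int, count_b: int) -> float:
    """Chi-square goodness-of-fit statistic (df = 1) for two observed counts,
    tested against the null hypothesis that both groups contribute equally."""
    total = count_a + count_b
    expected = total / 2  # equal expected count per group under the null
    return ((count_a - expected) ** 2 + (count_b - expected) ** 2) / expected


# ASSUMED counts of correct solvers (inferred, not reported in the text).
cp3_correct = 15  # about 0.52 of the CP3 group
cp_correct = 3    # about 0.09 of the CP group
vp_correct = 20   # chosen so that VP + CP3 matches the reported total of 35

print(round(chi_square_equal_split(cp3_correct, cp_correct), 2))  # 8.0
print(round(chi_square_equal_split(vp_correct, cp3_correct), 2))  # 0.71
```

With df = 1 and an alpha level of 0.05, the critical chi-square value is 3.84, so under these assumed counts the first comparison exceeds it and the second does not.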

An analysis of the completion time (in minutes) for the Remote transfer question
showed no difference between the CP group in Experiment 1 and the CP3 group (M =
2.02, SD = 1.74; M = 2.97, SD = 2.76, respectively, U(31, 30) = 311, p > 0.05).
However, the VP group from Experiment 1 took less time than the CP3 group to solve
this problem (M = 1.69, SD = 1.44; M = 2.97, SD = 2.76, respectively, U(30, 30) = 273, p
< 0.05). While the CP3 group was as accurate as the VP group at solving the Remote
transfer problem, it took them longer.

Combination  problem.  The  Combination  problem  (Figure  4)  required  both 
principles  be  used  together.  As  with  the  CP  group,  no  member  of  the  CP3  group  solved 
this  problem  correctly. 

However, the proportion of CP3 members giving the predicted incorrect response
by applying the difference principle was 0.76. The proportion of members from the CP
group in Experiment 1 who gave the predicted incorrect response to the same problem
was 0.90. There was no significant difference between these two groups (χ²(1, N = 44)
= 2.27, p > 0.05). For the VP group in Experiment 1, this proportion was 0.70. There was no
difference between the CP3 group and the VP group either (χ²(1, N = 38) = 0.42, p >
0.05).

A comparison of the completion times (in minutes) for the Combination transfer
question showed that the CP group took less time to solve this problem than the CP3
group (M = 0.69, SD = 0.78; M = 1.00, SD = 1.14, respectively, U(31, 30) = 289, p <
0.05). There was no difference between the VP group and the CP3 group (M = 0.96, SD =
0.64; M = 1.00, SD = 1.14, respectively, U(30, 30) = 413, p > 0.05).

Transfer  discussion 

The focus in Experiment 3 was on the CP group and possible alternative transfer
strategies. Many distinct processes could enable one to recognize an opportunity to apply
a solution learned in one context to a problem in another. The purpose of this experiment
was to create a situation that fostered a deliberate strategy for transferring problem
solving procedures.

In Experiment 1, the VP group was significantly better at solving the Remote
transfer problem than the CP group. This effect was attributed to the differences in
practice between the groups. However, in Experiment 3 a group (CP3) given the
same CP training as before solved the Remote transfer problem with the same level of
accuracy as the VP group. The CP3 group took numerically longer to solve the Remote
transfer problem than either the CP or VP group, although only the difference from the VP
group was significant. This could suggest additional processing if
supported by other evidence. The only difference between the two CP groups was the
presentation of a simple transfer problem preceding the Remote transfer problem. It
seems that solving a transfer problem close in similarity to a known problem can improve
performance on a remote transfer problem.

Comparing the three groups on the Combination problem revealed some
differences in performance. The accuracy of the CP3 group was no different from that of its
counterpart (CP) in Experiment 1. The CP group appeared more likely to use a rule predictably to
solve the Combination problem, while the responses of both the VP and CP3 groups showed
a wider degree of variation. These differences were not significant. There were,
however, differences between the CP3 and CP groups in response time on the Combination
problem. This echoes the pattern that the CP3 group took slightly longer than the CP and
VP groups to solve the transfer problems in general.

From this set of transfer tasks, it is clear that the presentation of a similar transfer
problem can improve performance on more remote transfer problems. It is possible that
in the Near transfer problem, familiar features in the problem (labels in this case) are
recognized and cue the retrieval of a known problem and its solution. Once the Near
transfer problem is solved, perhaps a transfer strategy is generalized. In this transfer
strategy, novel problems may be thoroughly inspected for their similarity to previously
learned problems. This would be a different process than that outlined by the learning
mechanism. The learning mechanism clearly failed to predict the performance of the CP3
group on the Remote transfer problem.

Over the course of training and the transfer task, the CP3 group
performed better than the VP group. They were more accurate during training, completed
the training in less time, and performed as accurately on the transfer tasks, albeit slightly
more slowly. Given this perspective, is there any benefit to contextual variability in
practice if the training is more difficult?

According to the learning mechanism, the contextual variability the VP group
received during training would make them better at recognizing the similarity in semantic
information between problems than the CP3 group. It seems that after the CP3 and VP
groups solved a transfer problem, either Near or Remote, they learned that solutions from
one kind of problem can be applied to another. Thus, both groups may learn and use
transfer strategies deliberately. The performance of the two groups on the transfer tasks
showed no significant differences in accuracy. If the performance of the VP group on the
transfer tasks was solely due to the use of a deliberate transfer strategy, then we would
expect no difference between the CP3 and VP groups during the matching tasks.

Alternatively, it could be that the Near transfer problem made some of the CP3
group more sensitive to the demand characteristics of the transfer task. Rather than
spontaneously recognizing the embedded principle in the Remote transfer task, they may
have tried to map both rules learned during training onto the problem. This alternative
strategy could also explain the increased response time for the CP3 group compared to the
CP and VP groups.

Matching  results 

After completing the three transfer tasks, participants moved on to the matching
tasks. The transfer tasks showed that the CP3 group in Experiment 3 solved the
Remote transfer problem as accurately as the VP group, without the benefit of
contextual variability in training. Although the groups solved the Remote transfer
problem equally well, it is possible they achieved this through different processes. If the
CP3 and VP groups used the same processes, both groups should exhibit the same matching
strategy. The matching tasks were again used in an attempt to gain insight into the
participants’ strategies for solving the transfer problems. Of interest was the comparison
of matching patterns between the training groups in Experiment 2 and the CP3 group.

According  to  the  learning  mechanism,  there  was  no  difference  in  retrieval  process 
between  the  CP3  and  CP  groups.  Therefore,  the  same  matching  predictions  were  made 
for  both  groups  on  the  Remote  and  Combination  problems.  While  the  terms  used  in  the 
Near  problem  cued  the  appropriate  analogous  problems  from  memory,  the  CP3  group 
would  not  be  any  better  than  the  CP  group  at  recognizing  semantic  similarities  in  the 
alternative  problems  that  indicated  the  required  principle. 

For each matching task, two proportions were compared: the proportion of the
group selecting the alternative problem with the same surface features and the proportion
of the group selecting the alternative problem requiring the same principle.

Matching to Near transfer problem. The response distribution of matches to the
Near transfer problem is shown in Figure 15. The alternative problem that had the same
features (Figure 16) as the Near transfer problem was selected the most (0.55). The
alternative problem requiring the same rule (Figure 17) was chosen by 0.38 of the group.
The majority of the CP3 group selected the alternative problem with similar features.

As seen in Experiment 1, comparing the matching responses of group members
who solved the Remote transfer problem correctly to those who solved it incorrectly
helped explain some of the variance in the matching data. As can be seen in Figure 18,
most of the members who matched the Near transfer problem to the alternative problem
requiring the same principle also solved the Remote transfer problem correctly. Most of
the members who matched the Near transfer problem to the alternative problem having
the same features failed to solve the Remote transfer problem. Since the CP and VP
groups in Experiment 2 did not receive the Near transfer problem, no comparisons can
be made.

Matching  to  remote  transfer  problem.  The  Remote  problem  represented  the 
classic  transfer  problem.  The  CP3  group  was  significantly  more  accurate  at  solving  the 
Remote  problem  in  the  transfer  task  than  the  CP  group.  In  addition,  there  was  no 
statistical  difference  between  the  CP3  group  and  the  VP  group  from  Experiment  1.  Of 
interest  in  the  Remote  matching  problem  was  how  the  matching  performance  of  the  CP3 
group  compared  to  the  CP  and  VP  groups  in  Experiment  2. 

The response distributions for all 3 groups are compared in Figure 19. A
summary of group performance, response times, and statistical comparisons is given in
Table 9. When it came to matching the Remote transfer problem with an alternative
problem that seemed the most useful for providing a solution to the target problem, the
proportion of the CP3 group who selected the problem with the same features was 0.48.
This was the group’s most common response. The proportion of the group who
selected the alternative problem that required the same principle as the Remote transfer
problem was 0.14. There was little difference in selection responses between those who
solved the Remote transfer problem correctly and those who did not.

In Experiment 2, the CP group was more likely than the VP group to match the
Remote problem with an alternative problem having the same features. The proportion of
the CP group in Experiment 2 who made a match corresponding to features (0.74) was no
different from the proportion in the CP3 group (χ²(1, N = 37) = 2.19, p > 0.05).

Both the CP and CP3 groups were more likely than the VP group to select the
alternative problem with the same features as the Remote problem. The proportion of the
VP group from Experiment 2 who made this selection (0.17) was less than the proportion
in the CP3 group (χ²(1, N = 19) = 4.26, p < 0.05).

Comparing the groups on matches by principle showed similar results.
Experiment 2 found that the VP group was more likely than the CP group to match the
Remote transfer problem to an alternative problem requiring the same principle. The
proportion of the CP group from Experiment 2 that made this match was 0.06. As stated
earlier, 0.14 of the CP3 group selected this problem as well. There was no difference
between these two groups (χ²(1, N = 53) = 0.17, p > 0.05).

The VP group was more likely than the CP and CP3 groups to select the
alternative problem requiring the same principle as the Remote transfer problem. The
proportion of the VP group from Experiment 2 who made this selection (0.57) was larger
than the proportion of the CP3 group (χ²(1, N = 38) = 3.89, p < 0.05).




Matching  to  combination  problem.  This  task  was  the  same  Combination 
matching  problem  used  in  Experiment  2.  After  all  the  data  was  collected,  a confound  was 
discovered  in  the  alternative  matching  problem  having  the  same  surface  features  as  the 
Combination  problem.  This  alternative  problem  required  the  compliance  rule  which 
could  have  been  applied  to  the  Combination  problem  even  though  it  provided  an  incorrect 
response.  The  response  distributions  for  the  CP  and  VP  groups  from  Experiment  2 and 
the  CP3  group  are  compared  in  Figure  19.  A summary  of  group  performance,  response 
times,  and  statistical  comparisons  is  given  in  Table  9. 

In matching the Combination problem with an alternative problem that seemed
the most useful for providing a solution, the proportion of the CP3 group who selected the
problem with the same features was 0.48. This was the group’s most common
response. The training the CP3 group received coupled the respiratory system with the
difference rule and no responses to the Combination transfer problem suggested any
participants used the compliance rule. Therefore, it was presumed that participants in the
CP3 group who chose the respiratory problem from the alternative set selected the match
based on surface features and not the compliance principle.

In Experiment 2, the proportion of the CP group who matched the Combination
problem with the alternative problem having the same features was 0.65. There was no
difference between the CP and CP3 groups (χ²(1, N = 34) = 1.06, p > 0.05). However, it is
possible that the proportion of the CP group who matched based on features was biased.

The proportion of the VP group from Experiment 2 who selected the match
corresponding to features was 0.57. However, it is very likely that this proportion
includes matches based on the compliance principle, surface features, and the influence
of both. Therefore, comparisons against these data were omitted.

The proportions of the CP3, CP, and VP groups who selected the alternative
problem whose solution involved the combination of both rules were 0.07, 0.10, and 0.07,
respectively. There was no difference between the CP3 group and either the CP or VP group in
Experiment 2 (χ²(1, N = 53) = 0.17, p > 0.05, and χ²(1, N = 53) = 0.17, p > 0.05,
respectively).

Matching  discussion 

In  the  transfer  tasks,  CP3  performed  better  than  the  VP  group  on  training  and 
generally  as  well  on  the  Remote  transfer  problem.  However,  there  were  significant 
differences  in  which  problems  the  two  groups  indicated  were  the  most  similar  to  the 
transfer  problems.  This  could  suggest  different  strategies  in  the  matching  task. 

The  CP3  group  did  well  at  solving  the  Near  transfer  problem  with  79%  of  the 
members  solving  the  problem  correctly.  Only  38%  of  the  group  matched  the  Near 
transfer  problem  to  the  alternative  problem  that  required  the  same  principle  for  the 
solution.  Comparing  these  two  percentages  gave  the  CP3  group  a principle-match  to 
transfer ratio of 0.48. The theory predicted that the CP3 group would
match the Near problem based on surface features, and fifty-five percent of the group did.
However, the theory cannot account for the 38% of the group that selected the alternative
problem requiring the same principle.

The classic test of transfer was the Remote problem. This problem proved to be
more difficult than the Near transfer problem, and only 52% of the CP3 group solved it
correctly. Only 14% of the group chose the alternative problem requiring the same
embedded principle as the Near transfer problem. This did not differ from the CP
group. Comparing the percent matching by principle to the percent transferring in the
CP3 group gives a ratio of 0.27 and indicates a large drop from the Near transfer problem.

The  VP  group  was  equivalent  in  accuracy  to  the  CP3  group  on  the  Remote  transfer 
problem  and  had  67%  group  accuracy.  However,  more  VP  group  members  selected  the 
alternative  problem  that  required  the  same  principle.  The  majority  of  the  VP  group  (57%) 
matched  based  on  principle.  The  principle-based  matching  to  transfer  ratio  for  the  VP 
group was 0.85. Compared with the CP3 group, this shows the advantage of the VP group's
training.
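
The principle-match to transfer ratios reported above are simple quotients, and the short Python sketch below reproduces them from the rounded group proportions given in the text. It is only an illustration: the function name match_to_transfer_ratio and the dictionary layout are assumptions introduced here, not part of the original analysis.

```python
def match_to_transfer_ratio(principle_match, transfer_accuracy):
    """Proportion matching by principle divided by proportion solving the transfer problem."""
    if transfer_accuracy <= 0:
        raise ValueError("transfer accuracy must be positive")
    return principle_match / transfer_accuracy


# Rounded proportions reported in the text: (principle match, transfer accuracy).
reported = {
    ("CP3", "Near"): (0.38, 0.79),
    ("CP3", "Remote"): (0.14, 0.52),
    ("VP", "Remote"): (0.57, 0.67),
}

# Ratios rounded to two decimals, as in the text.
ratios = {key: round(match_to_transfer_ratio(*values), 2) for key, values in reported.items()}

for (group, problem), ratio in ratios.items():
    print(f"{group} {problem}: {ratio:.2f}")
```

A ratio near 1 would mean that nearly as many participants matched by principle as solved the transfer problem, whereas a low ratio means that many participants solved the problem without selecting the principle-based match.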

The  Remote  problem  clearly  showed  a difference  in  matching  strategy  between 
the  CP3  and  VP  groups.  While  the  CP3  group  was  able  to  recognize  the  principle 
embedded  in  the  Remote  transfer  problem,  they  were  not  as  good  as  the  VP  group  at 
recognizing  this  principle  across  the  six  alternative  problems  in  the  Remote  matching 
task.  This  suggests  that  performance  of  the  VP  group  was  not  based  solely  on  the  use  of  a 
deliberate  comparison  strategy. 

On  both  the  Near  and  Remote  problems,  more  CP3  members  were  able  to  solve 
the  transfer  problems  than  could  pick  out  the  alternative  problem  with  the  matching 
principle.  However,  the  CP3  group  did  better  at  selecting  matches  by  principle  in  the 
Near  problem  than  on  the  Remote  problem.  It  is  possible  that  participants  were  less 
certain  of  their  solutions  to  the  Remote  problem  and  were  more  inclined  to  default  to  a 
feature-based match. Figure 18 shows the matching pattern of the CP3 group to the Near
transfer problem broken down by those who solved the Remote transfer problem and
those who did not. The majority of the CP3 group who selected the feature-based match
failed to solve the Remote transfer problem correctly.

How was it that about half of the CP3 group solved the Remote transfer problem
by transferring a principle learned from one domain to a new domain but did not seem to
recognize the same principle in other problems, as indicated by the group's performance in
the  matching  task?  Perhaps  the  Near  transfer  problem  brought  to  the  attention  of  the  CP3 
group  that  the  principles  could  be  applied  to  problems  other  than  those  in  which  they 
were  learned.  With  this  strategy  made  apparent,  group  members  may  have  explicitly  tried 
to  decide  which  of  the  two  rules  they  learned  could  be  used  to  solve  the  Remote  transfer 
problem.  It  would  be  easier  to  attempt  to  map  two  principles  into  one  transfer  problem  to 
see  if  one  produced  a solution  than  to  attempt  to  map  one  principle  into  six  alternative 
problems  to  find  the  one  that  was  solvable  by  that  principle.  Therefore,  the  CP3  group 
may  have  been  more  accurate  with  the  Remote  transfer  problem  because  the  problem 
space was smaller than with the matching task. The response times of the CP3 group
were slightly longer in the transfer task, which one would expect if they used a
trial-and-error strategy. However, no response time differences were found between the CP3, CP,
and VP groups in the matching task.

Turning to the VP group, the learning mechanism held that the contextual
variability in which they learned the principles forced them to adopt a strategy of attending
to semantic information in problems first. Therefore, it was believed that the VP group
would be more likely to recognize the principle required by the Remote transfer problem
across domains. This prediction was supported.



The  transfer  and  matching  tasks  in  Experiment  3 demonstrated  several  limitations 
of  the  learning  mechanism  in  describing  the  early  stages  of  the  transfer  process.  It  failed 
to  predict  the  CP3  group’s  ability  to  solve  the  Remote  transfer  problem  or  to  predict  their 
matching  performance  to  the  Near  transfer  problem.  An  explanation  of  the  early  stages  of 
transfer  based  on  increased  activation  of  semantic  categories  and  adaptive  attention 
cannot  account  for  the  recognition  of  transfer  opportunities  in  all  circumstances. 

Together,  the  results  of  this  matching  task  suggest  that  some  of  the  CP3  group 
were  able  to  recognize  embedded  principles  in  problems  when  they  were  sure  what  the 
principle  was.  The  VP  group  was  better  at  recognizing  embedded  principles  overall. 
While  both  groups  were  aware  they  could  use  transfer  strategies,  there  was  an  added 
value  for  recognizing  transfer  opportunities  with  contextual  variability  in  practice. 

One  could  argue  that  the  VP  group  practiced  this  method  and  so  they  should  be 
better.  The  difference  between  the  two  groups,  then,  is  not  in  the  kind  of  strategy  they 
use  but  their  proficiency  with  it.  Perhaps  the  more  interesting  result  was  that  the  CP3 
group did as well as they did without the practice that the VP group received. From the
one-trial exposure to the Near transfer problem, 38% of the CP3 group matched this
problem with a problem having the same principle. The one-trial exposure also led to a
41% increase in accuracy on the Remote transfer problem compared to the CP group, an
8% increase in selecting a match with the same principle, and a 26% decrease in selecting
a match with the same features in the Remote transfer problem.

In conclusion, the Near transfer problem was very effective at improving the CP3
group's ability to transfer solutions and, in some cases, improved their ability to
recognize transfer opportunities beyond surface-level feature matches. Their predominant
strategy in the matching task, however, was to select problems that shared the same
surface features as the target problem.


[Figure 14 diagram labels: Tank Elasticity = 3; Pull (B) = 45; Change in Pull = -7]

Figure 14

The Near Transfer problem required a known rule (difference rule) in an unfamiliar
context (water treatment system) but had semantic cues in the labeling

Table 7

Summary and Comparison of Training Performance Between Groups in Experiment 1 and Experiment 3

Training Condition    Number Correct             Completion Time (min)
                      M (SD)                     M (SD)

CP3                   26.03 (1.94)               9.34 (3.24)
CP                    27.16 (3.03)               9.94 (2.84)
Comparison            *                          *

CP3                   26.03 (1.94)               9.34 (3.24)
VP                    23.57 (3.81)               13.02 (5.45)
Comparison            t(58) = 2.00, p < 0.005    U(30, 30) = 245, p < 0.01

* No statistical difference between groups


Table 8

Summary and Comparison of Group Accuracy Between Training Groups from Experiment 1 and Experiment
3 on the Remote and Combination Transfer Problems

                      Remote                                            Combination
Training Condition    Proportion of Group      Response Time (min)      Proportion of Group      Response Time (min)
                                               M (SD)                   (as Predicted) a         M (SD)

CP3                   0.52                     2.97 (2.76)              0.76                     1.00 (1.14)
CP                    0.09                     2.02 (1.74)              0.90                     0.69 (0.78)
Comparison            χ²(1, N = 18) = 8.00,    *                        *                        U(31, 30) = 289,
                      p < 0.01                                                                   p < 0.05

CP3                   0.52                     2.97 (2.76)              0.76                     1.00 (1.14)
VP                    0.67                     1.69 (1.44)              0.70                     0.96 (0.64)
Comparison            *                        U(30, 30) = 273.00,      *                        *
                                               p < 0.05

a For the Combination problem, the comparison of predicted responses was favored over true accuracy. See
text for details

* No statistical difference between groups


[Figure 15: bar chart. Vertical axis: Number of Observations. Horizontal axis: Similarity Match
Selection (F = feature, P = principle, d = distracter)]

Figure 15

Frequency distribution of similarity matches by CP3 group
to the Near transfer problem


[Diagram labels: Shift in Pull from Tank 2 = 3; Frictional Opposition of Pipe = 7;
Shift in Pull from Tank 1 = ; Flexibility of Tank 1 = 5; Sewage Drift Rate = 30]

Alternative problem having the same features as the Near transfer problem in Experiment
3


Figure  17 

Alternative  problem  requiring  the  same  embedded  principle  as  the 
Near  transfer  problem  in  Experiment  3 


[Figure 18: bar chart. Vertical axis: Number of Observations. Horizontal axis: Similarity Match
Selection (F = feature, P = principle, d = distracter)]

Figure 18

Frequency distribution of similarity matches made by the CP3 group
to the Near transfer problem broken down by members that solved
the Remote transfer problem correctly and those who did not


[Figure 19: bar charts, including a panel titled Combination. Vertical axis: Number of
Observations. Horizontal axis: Similarity Match Selection (F = feature, P = principle,
d = distracter)]

Figure 19

Frequency distribution of similarity matches made by the VP and CP groups in
Experiment 2 and the CP3 group in Experiment 3 to the Remote and
Combination transfer problems

a In the Combination problem, the alternative problem with matching features
also required the compliance principle
b This problem required the difference principle
c This problem required the actual combination principle


Table 9

Comparison of Matching Responses Between Training Groups in Experiment 2 and Experiment 3 on the
Remote and Combination Matching problems

                      Matching Condition (Proportion of Group)
                      Remote                                          Combination                                     Completion Time (min)
Training Condition    Features                Principle               Features                Principle               M (SD)

CP3                   0.48                    0.14                    0.48                    0.07                    3.22 (1.72)
CP                    0.74                    0.06                    0.65                    0.10                    2.71 (2.40)
Comparison            *                       *                       *                       *

CP3                   0.48                    0.14                    0.48                    0.07                    3.22 (1.72)
VP                    0.17                    0.57                    0.57                    0.07                    2.65 (1.73)
Comparison            χ²(1, N = 19) = 4.26,   χ²(1, N = 38) = 3.89,   Omitted a               *
                      p < 0.05                p < 0.05

* No statistical difference between groups.

a The comparison between the CP3 and VP groups was omitted due to a confound in the Combination
matching task. See text for details.


DISCUSSION 


The  purpose  of  these  studies  was  to  explore  recognition  processes  in  the  early 
stages of transfer. The various theories of transfer agree that the early
stages involve the internal representation of the problem and the activation and retrieval
of  similar  problems  from  memory  (Reeves  & Weisberg,  1994;  Novick,  1988).  The 
representation  a solver  forms  of  the  problem  frames  the  solution  space  explored  and  is  a 
key  determinant  of  their  solution  (Elio  & Anderson,  1981).  Therefore,  the  likelihood  of 
transfer  is  affected  by  the  similarity  between  the  solver’s  representation  and  problems 
stored  in  memory  (Kotovsky,  Hayes,  & Simon,  1985;  Novick,  1988). 

Studies of expert/novice differences show that the two groups
categorize problems differently (Chi et al., 1981; Adelson, 1981; Lesgold, Rubinson, Feltovich,
Glaser, Klopfer, & Wang, 1988; Schoenfeld & Hermann, 1982; Hardiman, Dufresne, &
Mestre, 1989). Experts tend to categorize problems by similarities at a higher level, such
as solution procedure, while novices categorize problems according to surface level
features. As a result, experts often transfer solution procedures across problem domains
better than novices.

While comparisons between experts and novices provide insights into the
development of skill, these studies seem inherently confounded. A group of experts and
a group of novices are likely to differ in many significant ways unrelated to
experience in the area of testing. For example, the experts in the Chi et al. (1981) study
were advanced physics graduate students. The novices were students who had recently
completed an introductory physics course. One would expect to find differences between
these two groups in age, years of education, and mathematical aptitude. In general, expert
populations are often self-selecting.

One  goal  of  this  study  was  to  isolate  and  manipulate  a single  factor,  contextual 
variability,  within  the  development  of  expertise.  In  this  way,  the  numerous  dimensions 
along  which  experts  and  novices  differ  could  be  controlled  and  the  contribution  of  a 
single  factor  to  performance  could  be  assessed.  The  interest  in  these  studies  was  how 
solvers  learned  to  shift  from  a reliance  on  surface  level  features  in  transfer  to  higher  level 
similarities  as  seen  in  differences  between  experts  and  novices. 

The  belief  here  was  that  the  shift  was  a result  of  how  solvers  weighed  the 
information  available  in  the  problem  and  the  related  information  that  was  cued  in 
memory.  Contextual  variability  was  used  to  create  a situation  where  novice  solvers 
would  adopt  a strategy  that  focused  attention  on  the  semantic  information  in  the 
problems.  Thus,  contextual  variability  could  be  used  to  change  how  solvers  categorize 
and  retrieve  problems  from  memory. 

The first thing these experiments revealed was that contextual variability in
training improved transfer of problem solving skills. Contextual variability had previously
been shown to improve transfer in motor skills (Catalano & Kleiner, 1984; Kerr & Booth,
1978) but not in problem solving skills. Also, the difference in transfer between the CP and
VP groups was comparable to the difference seen between experts and novices in
problem solving transfer (Novick, 1988; Elio & Anderson, 1981).



The matching task showed that contextual variability in training improved solvers'
ability to recognize principles embedded in novel problems. The matching task was used
to reveal the recognition process required in the transfer tasks. The interest was in what
attribute of the problems was used by the solvers to categorize them. As in the Chi et al.
(1981) problem sorting task, the similarity matches the subjects made indicated the
dimension along which the subjects assessed similarity. The categorization strategies
suggested differences in the mental representation solvers had of the
problems and differences in how the solvers weighed information in the problems. While
there was a confound in the Combination matching task that prevented a complete
analysis of that problem, it had no effect on the major findings in the study.

Like the physics experts in the Chi et al. (1981) study, the VP group was able to
categorize  problems  in  the  matching  tasks  according  to  the  embedded  principle  in  the 
problem.  The  CP  group  consistently  matched  problems  according  to  a feature-based 
strategy  while  the  VP  group  matched  problems  according  to  a rule-based  strategy  for 
problems  where  they  were  expected  to  use  a transfer  solution.  In  cases  where  the 
problem  was  unknown,  the  VP  group  defaulted  to  a feature-based  match  like  the  CP 
group. 

A simple  learning  mechanism  rooted  in  activation-based  learning  was  proposed  to 
describe  changes  that  took  place  in  the  structure  of  memory  as  each  group  learned  to  use 
the  principles  during  training.  The  theory  accounted  for  the  transfer  differences  between 
the  two  training  groups  and  made  predictions  in  the  matching  task  that  were  supported. 
The belief proposed here was that the contextual variation in practice the VP group
received made the higher order information (in this case, the semantic information in the
labels) a more reliable source of information in predicting which principle was required.

In  the  first  transfer  experiment  and  following  matching  experiment,  the  Remote 
transfer  problem  was  designed  to  minimize  any  perceptual  similarity  to  the  practice 
problems.  In  the  everyday  world,  problems  range  in  similarity  and  many  problems  that 
are  perceptually  alike  share  common  solution  procedures.  The  next  transfer  and  matching 
experiment  examined  the  effect  that  a simple  transfer  problem  had  on  the  Remote 
problem. 

The  CP3  group  was  highly  accurate  at  solving  the  Near  transfer  problem  and  was 
better  than  the  CP  group  at  solving  the  Remote  transfer  problem.  However,  the  CP3 
group  took  slightly  longer  to  solve  the  Remote  transfer  problem.  Clearly  the  Near 
transfer  problem  improved  performance  at  Remote  transfer  without  directing  the  solver  to 
use  a transfer  strategy.  The  CP3  group  appeared  to  recognize  the  embedded  principle  in 
the  Near  problem  and  adapted  it  to  fit  the  problem.  It  seemed  that  from  this  single  trial, 
participants  generalized  the  strategy  of  transferring  a principle  to  the  Remote  problem. 
While  the  Near  transfer  problem  brought  the  performance  of  the  CP3  group  up  to  the  level 
of  the  VP  group  on  the  Remote  transfer  problem,  there  were  significant  differences  in  the 
matching  performance.  The  CP3  group  did  fairly  well  at  finding  a rule-based  match  to  the 
Near  transfer  problem.  However,  they  were  no  better  than  the  CP  group  at  selecting  a 
rule-based  match  to  the  Remote  transfer  problem.  With  practice  at  solving  a wide  range 
of  problems  involving  these  principles,  the  CP3  group  would  improve  their  ability  to 
recognize  the  semantic  similarities  between  problems  as  the  VP  group  did. 



The  fact  that  the  CP3  and  VP  groups  performed  the  same  on  the  transfer  task  but 
differently  on  the  matching  task  could  indicate  the  groups  used  different  processes  to 
solve  the  transfer  problem.  However,  it  is  difficult  to  make  such  distinctions  in  many 
cases.  To  say  conclusively  that  different  processes  were  involved  would  require  unique 
predictions  based  on  the  presumed  processes  and  additional  studies.  Furthermore,  it  may 
be  impossible  to  separate  VP  and  CP3  differences  from  general  practice  effects.  While 
the  slightly  longer  response  times  observed  for  the  CP3  group  in  the  transfer  tasks  could 
correspond  to  a trial  and  error  strategy,  no  RT  differences  were  found  among  the  groups 
on  the  matching  tasks.  One  would  have  expected  even  greater  RT  differences  if  a trial 
and  error  strategy  was  consistently  used  by  a majority  of  the  CP3  group  in  the  matching 
task. 

There  were  a few  methodological  issues  that  could  have  been  improved  upon  in 
these  studies  aside  from  those  previously  discussed.  Regarding  the  problems  used  in 
Experiment  3,  both  the  Near  and  Remote  problems  used  the  same  context  of  a water 
treatment system. The nature of the task and the limited number of principles make it likely
that participants in the CP3 group used trial and error to map both principles onto
the problem. This demonstrates that a variety of strategies could have been used in
both the transfer and matching tasks. While the predicted results were generally
observed, the experimental tasks do not rule out alternative strategies. Furthermore, a
critical assumption was made in these studies that the cognitive processes involved in the
matching task were the same as those used in the transfer task. In general, the demand
characteristics and potential set effects in the testing phase of all the experiments might
have been mitigated by a longer delay between training and testing, additional distracter
problems, or an intervening task.

While  the  learning  mechanism  gave  a generally  accurate  account  of  transfer  and 
matching  performance  in  Experiments  1 and  2,  it  was  refuted  by  the  ability  of  the  CP3 
group to transfer to the Remote problem. The large number of CP3 participants who were
able to match the Near problem with the alternative problem requiring the same principle
illustrated that strategies other than the one assumed by the mechanism were used as well.
This demonstrates one of the limitations of the learning mechanism. The purpose was to
see how much of the data could be accounted for by an isolated mechanism. The mechanism
explicitly ruled out any direct comparison strategies. It also ruled out any feedback loop
from training, which is necessary for learning (Ohlsson, 1996).

The important concept in these studies was that perceptual information serves as a
stronger retrieval cue than conceptual information when a solver is confronted with a novel problem.
As  long  as  perceptual  cues  reliably  categorize  and  predict  which  associated  rule  is 
successful,  they  will  dominate  the  retrieval  process. 

However,  if  perceptual  cues  do  not  reliably  predict  which  principle  is  involved, 
what  strategies  do  people  adopt?  Over  a set  of  problems  that  change  in  perceptual 
features,  solvers  may  perceive  subtle  but  consistent  information  that  offers  a dimension 
on which to categorize the problem. In these experiments, this information was the semantic
associations among terms.

A prediction directly related to the limitations of the learning mechanism that was
not tested is that participants would apply a principle suggested by semantic similarity
even if it violated other structural properties of the problem. As this theory focused only
on activation and matching of semantic categories, it is possible that the semantics could
suggest an inappropriate solution. For example, would the VP group apply the structure
of the difference principle in a nonsense fashion such as Flow = (height1 - height2) /
distance? This would demonstrate that they could map variables into the principles but
possessed a limited understanding of the concepts involved.

Another possible study concerns the shift in categorization criteria over training.
If the adaptation of memory over training occurs as predicted, it would be possible to
observe the shift in transfer strategies from surface features to semantic features across
training. The use of transfer strategies based on surface features should taper off during
training, and this prediction could be tested by tracking strategy use over the course of training.

In  these  transfer  and  matching  tasks,  categorization  was  a key  step.  By 
categorizing  a problem,  information  associated  with  the  category  such  as  familiar 
problems  and  associated  strategies  could  be  cued  in  memory  (Chi  et  al.,  1981;  Blessing  & 
Ross,  1996).  In  this  way,  categorizing  a problem  helped  to  define  the  problem  space 
(Newell  & Simon,  1972). 

Contextual variability was offered as a manipulation that would cause a shift in
categorization criteria. An activation-based learning mechanism provided a simple
explanation for the shift in categorization criteria between the VP and CP groups
demonstrated in transfer and similarity matching. The criteria for categorization could
affect representation, transfer, and other problem solving processes.

The studies presented here suggest that variation in the surface features of practice
problems helps to broaden the category prototype, improving transfer and the recognition
of embedded principles. Variation in problem features has been found to broaden
category membership in other studies as well (Smith & Sloman, 1994; Hintzman, 1986).

The  fact  that  the  CP  group,  and  to  some  degree  the  CP3  group,  did  not  categorize 
problems  based  on  embedded  principles  as  well  as  the  VP  group  suggested  differences  in 
their  internal  representation  of  the  problems.  In  the  present  studies,  each  trial  was 
considered  a learning  episode.  The  VP  group  was  exposed  to  a set  of  problems  that 
reinforced  a successful  strategy  where  problems  were  categorized  based  on  the  structural 
features in the problem. The CP and CP3 groups were exposed to a set of problems that
could  be  successfully  categorized  based  on  the  surface  level  features. 

In  practice,  experts  repeatedly  encounter  categories  of  problems  with  specific 
characteristics.  Contextual  variation  is  often  coupled  with  the  development  of  expertise. 
The  results  in  this  study  suggest  that  variation  in  instances  of  the  problem  will  determine 
how  flexible  the  expert  is  at  recognizing  the  problem  under  different  circumstances  and, 
as  a result,  the  likelihood  they  will  transfer  their  skills.  For  problem  schemas  to  facilitate 
transfer,  they  must  be  identifiable  across  a range  of  situations.  Contextual  variability 
provides  for  this  and  is  implicit  in  some  theories  of  schema  formation  such  as  Holyoak’s 
pragmatic  schema  theory  (Holyoak  & Thagard,  1989)  and  Hintzman’s  Minerva 
(Hintzman, 1986).

Several  studies  have  examined  differences  between  experts  and  novices  in  transfer 
(Novick,  1988)  and  categorization  (Chi  et  al.,  1981;  Adelson,  1981).  These  studies  often 
conclude  that  experts  categorize  problems  according  to  structural  properties  while 
novices group problems by surface level features. Experts attend to information that
novices do not. In studies of chess, experts were found to use different perceptual
strategies than novices (Simon & Chase, 1973). These differences are usually attributed
to the expert's reliance on large, associated bodies of knowledge or schemas.

However,  schemas  alone  do  not  afford  transfer  across  domains.  Other  studies 
have found expertise to inhibit transfer of skill (Wiley, 1998; Ericsson & Charness,
1994). While schemas improve information recognition, storage, and retrieval, they may
not  be  activated  in  situations  where  they  are  relevant.  Schemas  can  also  be  used 
inappropriately.  Experts  sometimes  fixate  on  using  certain  schemas  once  activated  and 
overlook  important  information  (Besnard  & Bastien-Toniazzo,  1999).  For  skill  to 
transfer,  the  conditions  under  which  the  opportunity  is  recognized  must  be  sufficiently 
flexible.  This  is  not  necessarily  inherent  in  expertise.  Nor  is  expertise  required  to 
develop  it. 

According  to  the  major  theories  of  analogical  transfer,  the  retrieval  of  base 
analogs  occurs  as  a similarity  match  between  the  solver’s  encoded  representation  of  the 
target  problem  and  the  representation  of  base  problems  stored  in  memory.  In  the 
pragmatic  schema,  structure  mapping,  and  exemplar  theories,  matches  between  the  target 
problem  at  hand  and  base  problems  in  memory  occur  at  any  level  of  abstraction.  Matches 
occur  across  surface  features,  semantic  similarities,  problem  constraints,  or  goals.  These 
theories  begin  to  differentiate  themselves  by  their  accounts  of  how  a single  analog  is 
selected  then  mapped  onto  the  target  problem. 

One  can  place  these  theories  on  a continuum  according  to  the  abstraction  of 
information  valued  by  the  solver  with  contextual  variability  as  the  transitional  agent. 
Contextual  variability  helps  explain  learning  and  retrieval  mechanisms  involved  in 
transfer and bridges the gap between the exemplar theories and the
structure-mapping/pragmatic schema theories. The difference between these low-level and
high-level theories may lie in which type of information in the problem the solver weights
most heavily and in the range of exposure the solver has had to similar problems.

Novice  solvers  exhibit  poor  transfer  in  unfamiliar  tasks  and  perform  as  predicted 
by  exemplar  theories  because  they  rely  on  the  easily  perceived  information  in  the 
problem.  The  surface  level  features  of  problems,  then,  play  a dominant  role  in  the 
retrieval  of  previously  learned  problems  or  category  prototypes.  This  is  a rational  strategy 
since  problems  that  have  similar  surface  features  often  have  similar  structural  features 
and  solutions.  As  a solver  is  exposed  to  more  variations  in  the  features  that  do  not  affect 
the  solution,  the  invariant  information  becomes  more  reliable  in  categorizing  and 
predicting  the  solution. 

Both the structure-mapping and pragmatic schema theories presume knowledge
that represents high-level information about the problem, which novices may not possess.
The structure-mapping theory assumes knowledge that represents the structural
relationships between the elements in a problem. The pragmatic schema theory requires
problem schemata that organize goal-relevant and solution-constraining information.

Experts  or  solvers  who  recognize  this  kind  of  structural  information  in  the 
problem  are  able  to  transfer  and  perform  as  predicted  by  structure  mapping  and  pragmatic 
schema  theories.  Their  attention  to  the  structural  information  in  the  problem  allows  them 
to  categorize  problems  and  draw  associations  beyond  the  scope  of  surface  features.  It 
also  cues  the  retrieval  of  problems  in  memory  that  share  solution  procedures.  Contextual 
variability  offers  one  way  in  which  solvers  implicitly  learn  to  attend  to  this  information. 


Contextual  variability  falls  into  a larger  grouping  of  practice  effects  that  have  been 
shown  to  improve  skill  retention  and  generalization.  Another  practice  effect  that  has 
received  extensive  study  is  practice  order.  Studies  on  practice  order  have  mostly 
compared  how  the  practice  of  multiple  skills  should  be  ordered  for  maximum  retention. 
The general findings show that training sets in which skills are randomly mixed lead to
improved retention compared to training in which skills are practiced in large blocks. While
random  practice  improves  skill  retention,  it  usually  impedes  skill  acquisition  (Shea  & 
Morgan,  1979;  Schmidt  & Bjork,  1992).  The  explanation  often  cited  for  this  effect  is  that 
random  practice  forces  the  learner  to  retrieve  information  relevant  to  the  skill  on  each 
trial.  It  is  believed  that  retrieval  practice  helps  learners  remember  the  task  over  time. 

Studies of practice order and contextual variability share the finding that some
training manipulations that impede performance during acquisition improve skill
retention and transfer. Skill retention was not examined in these studies, but it is possible
that the increased processing required of the VP group during training would lead to better
retention of problem solving skills compared to the CP and CP3 groups. However, this
remains to be tested.

The  findings  of  this  study  offer  practical  implications  for  training  of  problem 
solving  skills.  First,  performance  during  acquisition  provides  an  imperfect  measure  of 
learning. While the performance of the CP and CP3 groups was better than that of the VP
group during training, the VP group generally performed either faster or more accurately
during testing.

Several  theories  of  transfer  emphasize  that  schemas  afford  transfer.  Schemas  are 
also believed to aid expert problem solving. It is possible that contextual variability
during practice encourages schema development. The objective of any training program
is  for  learners  to  apply  their  knowledge  to  different  problems.  Therefore,  transfer  tasks 
may  be  a good  way  to  assess  whether  or  not  learners  have  abstracted  a higher  level  of 
understanding  of  a problem. 

In this study, participants learned two principles related to flow. The group that
practiced these principles in a variety of contexts was better able to recognize them in
new settings. This indicates that if the goal of training is to be able to apply these
principles in different settings, the contexts in which the problems are learned should vary.
If  the  goal  of  training  is  to  apply  the  principles  in  a variety  of  settings  within  the  same 
domain,  the  principles  do  not  need  to  be  practiced  in  different  domains  (although  that 
would  help  them  generalize).  However,  there  must  be  sufficient  variation  between 
practice  problems  to  de-emphasize  unimportant  and  potentially  misleading  surface 
features  and  bolster  recognition  of  the  causal  elements  in  the  problem. 

The  findings  of  this  study  support  the  claims  that  variation  in  practice  improves 
transfer  of  problem  solving  skills.  They  also  show  that  transfer  can  be  improved  when 
solvers  are  aware  of  a transfer  strategy  and  can  generalize  this  strategy  from  a simple 
transfer  problem.  Other  studies  have  found  ways  to  induce  spontaneous  transfer  by 
presenting solvers with hints (Gick & Holyoak, 1980), illustrations of the embedded
principles (Pedone, Hummel, & Holyoak, 2001), explicit problem solving schemas
(Clement, 1994), and summary solutions (Gick & Holyoak, 1983). The contextual
variability and Near transfer manipulations also improved similarity matching based on
principles.


Together, these experiments provide a more detailed examination of the
relationships between contextual variability and transfer than previous studies have provided.
Furthermore,  a model  of  activation-based  learning  could  provide  an  explanation  for  the 
shift  in  transfer  strategy  seen  between  experts  and  novices  under  certain  circumstances. 
The  implications  of  these  studies  are  important  for  current  theories  of  transfer  and 
provide  insights  into  the  differences  between  experts  and  novices  in  categorization, 
representation,  and  problem  solving. 


REFERENCES 


Adelson,  B.  (1981).  Problem  solving  and  the  development  of  abstract  categories 
in  programming  languages.  Memory  and  Cognition,  9,  422-433. 

Anderson, J. R., & Lebiere, C. (1998). The atomic components of thought. Mahwah,
NJ: Erlbaum.

Besnard, D., & Bastien-Toniazzo, M. (1999). Expert error in troubleshooting: An
exploratory study in electronics. International Journal of Human-Computer Studies, 50,
391-405.

Blessing, S. B., & Ross, B. (1996). Content effects in problem categorization and
problem solving. Journal of Experimental Psychology: Learning, Memory, and Cognition,
22, 792-810.

Bransford,  J.  D.,  Franks,  J.  J.,  Morris,  C.  D.,  & Stein,  B.  S.  (1979).  Some  general 
constraints on learning and memory research. In L. S. Cermak & F. I. M. Craik (Eds.),
Levels  of  processing  in  human  memory  (pp.  331-354).  Hillsdale,  NJ:  Erlbaum. 

Catalano,  J.  F.,  & Kleiner,  B.  M.  (1984).  Distant  transfer  in  coincident  timing  as  a 
function  of  variability  in  practice.  Perceptual  and  Motor  Skills,  58,  851-856. 

Catrambone, R., & Holyoak, K. J. (1989). Overcoming contextual limitation on
problem-solving transfer. Journal of Experimental Psychology: Human Learning and
Memory, 15, 1147-1156.

Chi, M. T. H., Feltovich, P. J., & Glaser, R. (1981). Categorization and
representation of physics problems by experts and novices. Cognitive Science, 5, 121-152.

Clement, C. (1994). Effect of structural embedding on analogical transfer:
Manifest versus latent analogs. American Journal of Psychology, 107, 1-38.

Elio, R., & Anderson, J. R. (1981). The effects of category generalizations and
instance similarity on schema abstraction. Journal of Experimental Psychology: Human
Learning & Memory, 6, 397-417.

Ericsson, K. A., & Charness, N. (1994). Expert performance: Its structure and
acquisition. American Psychologist, 49, 725-747.


Feigenbaum, E. A., & Simon, H. A. (1984). EPAM-like model of recognition and
learning. Cognitive Science, 8, 305-336.

Gentner, D. (1983). Structure-mapping: A theoretical framework for analogy.
Cognitive Science, 7, 155-170.

Gentner, D. (1989). The mechanisms of analogical reasoning. In S. Vosniadou &
A. Ortony (Eds.), Similarity and analogical reasoning (pp. 199-241). Cambridge,
England: Cambridge University Press.

Gick, M. L., & Holyoak, K. J. (1980). Analogical problem solving. Cognitive
Psychology, 12, 306-355.

Gick, M. L., & Holyoak, K. J. (1983). Schema induction and analogical transfer.
Cognitive Psychology, 15, 1-38.

Hardiman, P. T., Dufresne, R., & Mestre, J. P. (1989). The relation between
problem categorization and problem solving among experts and novices. Memory and
Cognition, 17, 627-638.

Hebb, D. O. (1949). The organization of behavior. New York, NY: Wiley.

Hintzman, D. L. (1986). "Schema abstraction" in a multiple-trace memory model.
Psychological Review, 93, 411-428.

Hintzman, D. L. (1988). Judgments of frequency and recognition memory in a
multiple-trace memory model. Psychological Review, 95, 528-551.

Holyoak,  K.  J.,  & Thagard,  P.  (1989a).  A computational  model  of  analogical 
problem  solving.  In  S.  Vosniadou  & A.  Ortony  (Eds.),  Similarity  and  analogical 
reasoning  (pp.  242-266).  Cambridge,  England:  Cambridge  University  Press. 

Holyoak,  K.  J.,  & Thagard,  P.  (1989b).  Analogical  mapping  by  constraint 
satisfaction. Cognitive Science, 13, 295-355.

Kerr, R., & Booth, B. (1978). Specific and varied practice in a motor skill.
Perceptual and Motor Skills, 46, 395-401.

Kimball, & Holyoak, K. J. (2000). Transfer and expertise. In E. Tulving & F.
Craik (Eds.), The Oxford handbook of memory (pp. 109-122). Oxford, England: Oxford
University Press.

Kotovsky, K., Hayes, J. R., & Simon, H. A. (1985). Why are some problems hard?
Evidence from Tower of Hanoi. Cognitive Psychology, 17, 248-294.


Larkin,  J.  H.,  McDermott,  J.,  Simon,  D.,  & Simon,  H.  A.  (1980).  Expertise  and 
novice  performance  in  solving  physics  problems.  Science,  208,  1335-1342. 

Lesgold, A., Rubinson, H., Feltovich, P., Glaser, R., Klopfer, D., & Wang, Y.
(1988). Expertise in a complex skill: Diagnosing x-ray pictures. In M. T. H. Chi, R.
Glaser, & M. J. Farr (Eds.), The nature of expertise (pp. 311-342). Hillsdale, NJ:
Erlbaum.

Mannes,  S.  M.,  & Kintsch,  W.  (1987).  Knowledge  organization  and  text 
organization.  Cognition  and  Instruction,  4,  91-115. 

Newell, A., & Simon, H. A. (1972). Human problem solving. Englewood Cliffs,
NJ: Prentice-Hall.

Novick, L. (1988). Analogical transfer, problem similarity, and expertise. Journal
of Experimental Psychology: Learning, Memory, and Cognition, 14, 510-520.

Ohlsson, S. (1996). Learning from performance errors. Psychological Review,
103, 241-262.

Pedone, R., Hummel, J. E., & Holyoak, K. J. (2001). The use of diagrams in
analogical problem solving. Memory & Cognition, 29, 214-221.

Reeves,  L.  M.,  & Weisberg,  R.  W.  (1994).  The  role  of  content  and  abstract 
information  in  analogical  transfer.  Psychological  Bulletin,  115,  381-400. 

Richman,  H.  B.,  Staszewski,  J.  L.,  & Simon,  H.  A.  (1995).  Simulation  of  expert 
memory  using  EPAM IV.  Psychological  Review,  102,  305-330. 

Rickard, T. C., Healy, A. F., & Bourne, L. E. (1994). On the cognitive structure
of basic arithmetic skills: Operation, order, and symbol transfer effects. Journal of
Experimental Psychology: Human Learning and Memory, 20, 1139-1153.

Ross, B. H. (1984). Remindings and their effects in learning cognitive skill.
Cognitive Psychology, 16, 371-416.

Ross, B. H., & Kennedy, P. T. (1990). Generalizing from the use of earlier
examples in problem solving. Journal of Experimental Psychology: Learning, Memory,
and Cognition, 16, 42-55.

Schmidt, R. A., & Bjork, R. A. (1992). New conceptualizations of practice:
Common principles in three paradigms suggest new concepts for training. Psychological
Science, 3, 207-217.


Schoenfeld, A. H., & Hermann, D. J. (1982). Problem perception and knowledge
structure in expert and novice mathematical problem solvers. Journal of Experimental
Psychology: Learning, Memory, and Cognition, 8, 484-494.

Shea, J. B., & Morgan, R. L. (1979). Contextual interference effects on the
acquisition, retention, and transfer of a motor skill. Journal of Experimental Psychology:
Human Learning and Memory, 5, 179-187.

Simon, H. A., & Chase, W. G. (1973). Skill in chess. American Scientist, 61,
394-403.

Smith, E. E., & Sloman, S. A. (1994). Similarity- versus rule-based categorization.
Memory & Cognition, 22, 377-386.

Sternberg, R. J. (1996). Costs of expertise. In K. A. Ericsson (Ed.), The road to
excellence (pp. 347-354). Hillsdale, NJ: Erlbaum.

Tversky,  A.  (1977).  Features  of  similarity.  Psychological  Review,  84,  327-352. 

Wiley,  J.  (1998).  Expertise  as  a mental  set:  The  effects  of  domain  knowledge  in 
creative problem solving. Memory and Cognition, 24, 629-643.

Zamani, M., & Richard, J. (2000). Object encoding, goal similarity, and
analogical transfer. Memory & Cognition, 28, 873-886.

Zhang, J. (1997). The nature of external representations in problem solving.
Cognitive Science, 21, 179-217.


BIOGRAPHICAL  SKETCH 


Ryan Terrell West was born in Southern Pines, North Carolina, in 1973. Ryan
graduated from Pinecrest High School in Southern Pines in 1991 and attended college at
the University of North Carolina at Charlotte. There he became interested in the study of
cognition through his experiences as a student of electrical engineering and architecture.

Following graduation with a B.A. in psychology in 1996, Ryan began his graduate
study  at  the  University  of  Florida.  He  received  his  Master  of  Science  degree  in  May  of 
1999  and  Doctor  of  Philosophy  degree  in  Cognitive  and  Sensory  Processes  in  August  of 
2002.  At  the  time  of  writing,  Ryan  works  as  a research  psychologist  for  Microsoft  and 
lives  in  Seattle,  Washington,  with  his  wife,  Chelsie,  and  dog,  LaRue. 


I certify  that  I have  read  this  study  and  that  in  my  opinion  it  conforms  to  acceptable 
standards  of  scholarly  presentation  and  is  fully  adequate,  in  scope  and  quality,  as  a dissertation 
for  the  degree  of  Doctor  of  Philosophy. 

Robert  D.  Sorkin,  Chairman 
Professor  of  Psychology 


I certify that I have read this study and that in my opinion it conforms to acceptable
standards  of  scholarly  presentation  and  is  fully  adequate,  in  scope  and  quality,  as  a dissertation 
for  the  degree  of  Doctor  of  Philosophy. 


Professor  of  Psychology 


I certify that I have read this study and that in my opinion it conforms to acceptable
standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation
for the degree of Doctor of Philosophy.

Alan C. Spector

Professor  of  Psychology 


I certify that I have read this study and that in my opinion it conforms to acceptable
standards  of  scholarly  presentation  and  is  fully  adequate,  in  scope  and  quality,  as  a dissertation 
for  the  degree  of  Doctor  of  Philosophy. 


Lise  Abrams 
Assistant  Professor  of 
Psychology 


I certify that I have read this study and that in my opinion it conforms to acceptable
standards  of  scholarly  presentation  and  is  fully  adequate,  in  scope  and  quality,  as  a dissertation 
for  the  degree  of  Doctor  of  Philosophy. 


Richard J. Melker
Professor of Biomedical
Engineering


I certify that I have read this study and that in my opinion it conforms to acceptable
standards  of  scholarly  presentation  and  is  fully  adequate,  in  scope  and  quality,  as  a dissertation 
for  the  degree  of  Doctor  of  Philosophy. 


Assistant  Professor  of  Electrical 
and  Computer  Engineering 


This  dissertation  was  submitted  to  the  Graduate  Faculty  of  the  Department  of  Psychology 
in  the  College  of  Liberal  Arts  and  Sciences  and  to  the  Graduate  School  and  was  accepted  as 
partial  fulfillment  of  the  requirements  for  the  degree  of  Doctor  of  Philosophy. 

August  2002 

Dean,  Graduate  School 


